
Informed Search & Heuristics (A* Algorithm)

Using domain knowledge to search smarter — how A* finds optimal paths with fewer node expansions.


Definition

Informed search strategies use a heuristic function h(n) — an estimate of the cost from state n to the nearest goal — to guide search toward promising regions of the state space. Key algorithms include Greedy Best-First Search, A* Search, Iterative-Deepening A* (IDA*), and Recursive Best-First Search (RBFS). A* is the gold standard: it is complete, optimal, and optimally efficient given an admissible (or consistent) heuristic. GATE DA heavily tests admissibility, consistency, the A* optimality proof, and heuristic dominance.

Heuristic functions

A heuristic function h(n) estimates the cost of the cheapest path from node n to a goal. Two critical properties govern whether a heuristic makes A* correct:

| Property | Definition | Implication |
| --- | --- | --- |
| Admissibility | h(n) ≤ h*(n) for all n, where h*(n) is the true optimal cost to the goal | A* tree search is guaranteed to find an optimal solution; the heuristic never overestimates. |
| Consistency (monotonicity) | h(n) ≤ c(n, a, n') + h(n') for every successor n' of n via action a | A* graph search is optimal, and every node is expanded at most once. Consistency implies admissibility (it is the stronger condition). |

Consistency (triangle inequality): the estimated cost at n cannot exceed the actual cost to n' plus the estimated cost from n'. This mirrors the triangle inequality in geometry.

Consistency ⊂ Admissibility relationship

Consistency implies admissibility, but not vice versa: every consistent heuristic is admissible, while an admissible heuristic need not be consistent. GATE often asks: "Is an admissible heuristic always consistent?" — Answer: No.
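These two definitions can be checked mechanically. The sketch below uses a made-up four-node graph with invented heuristic values; it verifies that the candidate heuristic h is admissible (never exceeds the true costs h*) yet fails the consistency inequality on the edge S→A:

```python
# Toy weighted graph: edges[n] = list of (successor, step cost).
edges = {
    'S': [('A', 1), ('B', 4)],
    'A': [('G', 5)],
    'B': [('G', 2)],
    'G': [],
}
h_true = {'S': 6, 'A': 5, 'B': 2, 'G': 0}   # h*(n): true optimal cost to G

# Candidate heuristic: admissible (h <= h* everywhere) but NOT consistent,
# because h(S) = 6 > c(S, A) + h(A) = 1 + 2 = 3.
h = {'S': 6, 'A': 2, 'B': 2, 'G': 0}

def is_admissible(h, h_true):
    """h never overestimates the true cost to the goal."""
    return all(h[n] <= h_true[n] for n in h)

def is_consistent(h, edges):
    """Triangle inequality h(n) <= c(n, n') + h(n') holds on every edge."""
    return all(h[n] <= c + h[m] for n, nbrs in edges.items() for m, c in nbrs)

print(is_admissible(h, h_true))       # True
print(is_consistent(h, edges))        # False: admissible but not consistent
print(is_consistent(h_true, edges))   # True: h* itself is always consistent
```

Note that h* itself passes both checks, which is why relaxed-problem costs (discussed below) make safe heuristics.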

Greedy Best-First Search

Greedy BFS expands the node that appears closest to the goal according to h(n). The frontier is a priority queue ordered by h(n) alone. It is fast but not optimal or complete (in tree search).

Greedy BFS evaluation function: only the heuristic estimate, ignoring the cost already paid to reach n.

| Property | Greedy BFS |
| --- | --- |
| Complete? | No (tree search); yes with an explored set in finite spaces |
| Optimal? | No: can follow a locally promising but globally suboptimal path |
| Time | O(b^m) worst case; much better in practice with a good heuristic |
| Space | O(b^m): keeps all nodes in memory |
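The frontier ordering described above can be sketched directly with a priority queue keyed on h(n) alone. The graph and heuristic table below are invented for illustration; note that edge costs never enter the ordering, which is exactly why Greedy BFS can return a suboptimal path:

```python
import heapq

def greedy_bfs(start, goal, neighbors, h):
    """Greedy best-first search: frontier ordered by h(n) alone.
    neighbors(s) yields successor states; h(s) estimates cost to goal.
    Not optimal -- returns the first path found, not the cheapest."""
    frontier = [(h(start), start, [start])]
    visited = set()
    while frontier:
        _, s, path = heapq.heappop(frontier)
        if s == goal:
            return path
        if s in visited:
            continue
        visited.add(s)
        for s2 in neighbors(s):
            if s2 not in visited:
                heapq.heappush(frontier, (h(s2), s2, path + [s2]))
    return None  # failure

# Hypothetical graph and heuristic values for illustration.
graph = {'S': ['A', 'B'], 'A': ['G'], 'B': ['G'], 'G': []}
h = {'S': 3, 'A': 1, 'B': 2, 'G': 0}
path = greedy_bfs('S', 'G', lambda s: graph[s], lambda s: h[s])
print(path)  # ['S', 'A', 'G']
```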

A* Search

A* combines UCS's cost-so-far g(n) with Greedy BFS's heuristic h(n). The priority queue is ordered by the evaluation function f(n) = g(n) + h(n). A* is the most widely used heuristic search algorithm.

A* evaluation function: g(n) = cost from start to n (known); h(n) = estimated cost from n to goal (heuristic). f(n) estimates the total cost of the cheapest solution through n.

A* never expands a node n with f(n) > C*, where C* is the optimal solution cost. This is the key to its efficiency. Every node with f < C* is guaranteed to be expanded; every node with f > C* is guaranteed not to be.

A* graph search (returns an optimal path when h is consistent)

import heapq
from itertools import count

def astar(problem, h):
    """A* graph search. h is a heuristic function h(state) -> estimated cost to goal.
    Returns the goal node; the solution is optimal when h is consistent."""
    start = problem.initial
    tie = count()   # tie-breaker so the heap never has to compare Node objects
    # Frontier entries: (f, g, tie, node)
    frontier = [(h(start), 0, next(tie), Node(start))]
    explored = {}   # state -> best g seen so far

    while frontier:
        f, g, _, node = heapq.heappop(frontier)
        s = node.state

        if problem.goal_test(s):
            return node  # first goal popped is optimal under consistency

        # Skip if we've already expanded this state with an equal or lower cost
        if s in explored and explored[s] <= g:
            continue
        explored[s] = g

        for action in problem.actions(s):
            child = node.expand(problem, action)
            g2 = g + problem.step_cost(s, action, child.state)
            f2 = g2 + h(child.state)
            if child.state not in explored or explored[child.state] > g2:
                heapq.heappush(frontier, (f2, g2, next(tie), child))

    return None  # failure

| Property | A* (admissible h) | A* (consistent h) |
| --- | --- | --- |
| Complete? | Yes (finite b, non-negative costs) | Yes |
| Optimal? | Yes (tree search) | Yes (graph search) |
| Time | Exponential in worst case | Exponential in worst case |
| Space | O(b^d): keeps all generated nodes | O(b^d) |
| Node expansions | Every node with f(n) < C* | Every node with f(n) < C*, each node at most once |
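For a concrete end-to-end run, here is a self-contained variant of A* over an explicit weighted graph, independent of the Node/problem classes assumed above. The graph, costs, and heuristic values are invented for illustration; the heuristic shown is both admissible and consistent, so the first goal popped is optimal:

```python
import heapq

def astar_graph(start, goal, edges, h):
    """Minimal self-contained A* on an explicit weighted graph.
    edges[n] = list of (successor, step cost); h[n] = heuristic estimate."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, state, path)
    best_g = {}                                   # state -> best g seen so far
    while frontier:
        f, g, s, path = heapq.heappop(frontier)
        if s == goal:
            return g, path
        if s in best_g and best_g[s] <= g:
            continue
        best_g[s] = g
        for s2, c in edges[s]:
            g2 = g + c
            if s2 not in best_g or best_g[s2] > g2:
                heapq.heappush(frontier, (g2 + h[s2], g2, s2, path + [s2]))
    return None

# Hypothetical instance: true costs are h*(S)=8, h*(A)=7, h*(B)=5, h*(G)=0,
# so this h is admissible (and consistent: check each edge).
edges = {
    'S': [('A', 1), ('B', 4)],
    'A': [('B', 2), ('G', 12)],
    'B': [('G', 5)],
    'G': [],
}
h = {'S': 7, 'A': 6, 'B': 5, 'G': 0}
print(astar_graph('S', 'G', edges, h))  # (8, ['S', 'A', 'B', 'G'])
```

Note how the direct edge A→G (cost 12) is generated but never chosen: its f-value 13 exceeds C* = 8, so A* ignores it, illustrating the contour property above.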

Admissible heuristic design: relaxed problems

The most principled method to design admissible heuristics is to solve a relaxed version of the problem — one with fewer constraints. The optimal cost of the relaxed problem is always ≤ the optimal cost of the original problem, guaranteeing admissibility.

| Domain | Original constraint | Relaxed problem | Resulting heuristic |
| --- | --- | --- | --- |
| 8-puzzle | A tile can slide to an adjacent blank square | A tile can move anywhere | Number of misplaced tiles h₁ (admissible) |
| 8-puzzle | A tile can slide to an adjacent blank square | A tile can move to any adjacent square | Manhattan distance h₂: sum over tiles of row offset + column offset (admissible, dominates h₁) |
| Route planning | Must follow roads | Can fly in a straight line | Straight-line (Euclidean) distance (admissible) |
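Both 8-puzzle heuristics from the table are a few lines of code. The sketch below represents a board as a 9-tuple in row-major order with 0 as the blank; the sample state is arbitrary:

```python
def misplaced_tiles(state, goal):
    """h1: number of tiles (excluding the blank, 0) out of place."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal):
    """h2: sum over tiles of |row - goal_row| + |col - goal_col| on a 3x3 board."""
    pos = {tile: divmod(i, 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            r, c = divmod(i, 3)
            gr, gc = pos[tile]
            total += abs(r - gr) + abs(c - gc)
    return total

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
state = (7, 2, 4, 5, 0, 6, 8, 3, 1)   # an arbitrary scrambled board
print(misplaced_tiles(state, goal))   # 6
print(manhattan(state, goal))         # 14
```

On this board h₂ = 14 while h₁ = 6: since h₂(n) ≥ h₁(n) for every state and both are admissible, h₂ dominates h₁.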

Heuristic dominance

h₂ dominates h₁ if h₂(n) ≥ h₁(n) for all n (both admissible). A* with h₂ expands no more nodes than A* with h₁. For 8-puzzle: Manhattan distance h₂ dominates misplaced-tiles h₁, so always prefer h₂. Given two admissible heuristics, h_max(n) = max(h₁(n), h₂(n)) is also admissible and dominates both — a frequently tested GATE fact.
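The max-combination fact is worth one line of code. A minimal sketch, with hypothetical heuristic tables standing in for h₁ and h₂:

```python
def h_max(h1, h2):
    # If h1(n) <= h*(n) and h2(n) <= h*(n), then max(h1(n), h2(n)) <= h*(n):
    # the combination stays admissible, and it dominates both components.
    return lambda n: max(h1(n), h2(n))

# Hypothetical per-state estimates: neither component dominates the other.
h1 = {'A': 3, 'B': 1}.get
h2 = {'A': 2, 'B': 4}.get
h = h_max(h1, h2)
print(h('A'), h('B'))  # 3 4
```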

IDA* and memory-bounded variants

A*'s main drawback is exponential memory usage. IDA* (Iterative Deepening A*) addresses this by running DFS with a cutoff on f(n): any node with f > threshold is pruned. The threshold increases from one iteration to the next (to the minimum f value that exceeded the previous threshold).

| Algorithm | Memory | Optimal? | Complete? |
| --- | --- | --- | --- |
| A* | O(b^d) | Yes (consistent h) | Yes |
| IDA* | O(bd) | Yes (admissible h) | Yes |
| RBFS | O(bd) | Yes (admissible h) | Yes |
| SMA* | Bounded by available memory | Yes (if enough memory) | Yes (if the solution fits in memory) |
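The threshold-raising loop of IDA* fits in a short sketch. The graph and heuristic values below are invented for illustration (the same hypothetical instance style as earlier); only the current path is stored, so memory is linear in the path length:

```python
def ida_star(start, goal, edges, h):
    """IDA*: depth-first contour search with an increasing f-cost threshold.
    Memory is O(path length); nodes may be re-expanded across iterations."""
    def dfs(s, g, threshold, path):
        f = g + h[s]
        if f > threshold:
            return f, None          # report the smallest f that exceeded the cutoff
        if s == goal:
            return f, path
        minimum = float('inf')
        for s2, c in edges[s]:
            if s2 not in path:      # avoid cycles on the current path
                t, found = dfs(s2, g + c, threshold, path + [s2])
                if found is not None:
                    return t, found
                minimum = min(minimum, t)
        return minimum, None

    threshold = h[start]            # first contour: f(root)
    while True:
        t, found = dfs(start, 0, threshold, [start])
        if found is not None:
            return found
        if t == float('inf'):
            return None             # no solution
        threshold = t               # raise cutoff to the minimum exceeded f

edges = {'S': [('A', 1), ('B', 4)], 'A': [('B', 2), ('G', 12)], 'B': [('G', 5)], 'G': []}
h = {'S': 7, 'A': 6, 'B': 5, 'G': 0}
print(ida_star('S', 'G', edges, h))  # ['S', 'A', 'B', 'G']
```

On this instance the first iteration runs with threshold 7, prunes at f = 8 and f = 9, restarts with threshold 8, and finds the optimal path in the second pass.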

GATE PYQ Spotlight

GATE DA Sample — Largest admissible h for A* (2-mark)

A state graph has known heuristic values and action costs. Starting from S, what is the largest value h(S) can take while still being admissible for A* graph search? Approach: Find the actual shortest path cost h*(S) from S to G (run Dijkstra). The answer is h(S) ≤ h*(S). This tests understanding that admissibility means no overestimation of true cost. Common GATE trap: confusing the heuristic value with the f-value (g+h).
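The approach reduces to one Dijkstra run: compute h*(S) and that value is the largest admissible h(S). A sketch on a made-up graph (not the actual GATE instance):

```python
import heapq

def shortest_cost(start, goal, edges):
    """Dijkstra's algorithm: returns the true optimal cost h*(start) to goal."""
    dist = {start: 0}
    pq = [(0, start)]
    while pq:
        d, s = heapq.heappop(pq)
        if s == goal:
            return d                 # largest admissible h(start)
        if d > dist.get(s, float('inf')):
            continue                 # stale queue entry
        for s2, c in edges[s]:
            if d + c < dist.get(s2, float('inf')):
                dist[s2] = d + c
                heapq.heappush(pq, (d + c, s2))
    return float('inf')

# Hypothetical instance for illustration.
edges = {'S': [('A', 1), ('B', 4)], 'A': [('B', 2), ('G', 12)], 'B': [('G', 5)], 'G': []}
print(shortest_cost('S', 'G', edges))  # 8, so any h(S) <= 8 is admissible
```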

GATE DA — Admissible heuristic combination (1-mark)

Let h₁ and h₂ be two admissible heuristics for A* search. Which expression is ALWAYS an admissible heuristic?

(A) h₁(n) + h₂(n)  (B) max(h₁(n), h₂(n))  (C) min(h₁(n), h₂(n))  (D) h₁(n) × h₂(n)

Answer: (B) and (C) are both admissible. max(h₁, h₂) ≤ h* because h₁ ≤ h* and h₂ ≤ h*; min(h₁, h₂) ≤ max(h₁, h₂) ≤ h*, so it is admissible too. h₁ + h₂ may exceed h*, so it is not admissible in general. max is preferred because it dominates both; the standard GATE answer is max(h₁, h₂).

GATE DA — Consistent heuristic and node re-expansion (1-mark)

True or False: If h is consistent, A* with graph search expands each node at most once. Answer: True. Consistency ensures f(n) is non-decreasing along any path, so when a node is first popped from the priority queue, it has the optimal g value. No re-expansion is needed. This is the proof of A* graph search optimality under consistency.

Practice questions

  1. What is the heuristic function h(n) in A* search and what properties must it satisfy for A* to be optimal? (Answer: h(n): estimated cost from node n to the goal. Must be admissible: h(n) ≤ h*(n) (true cost to goal) — never overestimates. If admissible, A* with tree search is optimal. For graph search (visited nodes not re-expanded), h must also be consistent (monotone): h(n) ≤ c(n,a,n') + h(n') — the estimated cost at n ≤ actual edge cost + estimated cost at successor. Every consistent heuristic is admissible; not vice versa.)
  2. What is the difference between A* and Greedy Best-First Search? (Answer: Greedy BFS: expands the node with lowest h(n) (estimated cost to goal only). Fast but suboptimal — ignores the actual path cost g(n), may go down a promising-looking but expensive path. A*: expands the node with lowest f(n) = g(n) + h(n) — balances actual cost so far and estimated remaining cost. Optimal (with admissible h). A* degenerates to Greedy BFS when g is ignored; degenerates to Uniform Cost Search when h=0.)
  3. What is the 8-puzzle heuristic — Manhattan distance — and why is it better than misplaced tiles? (Answer: Misplaced tiles: counts tiles not in goal position. Manhattan distance: Σ |row_i - row_goal| + |col_i - col_goal| for each tile. Manhattan distance is a better (closer to optimal) heuristic — it accounts for HOW FAR each tile is from its goal, not just whether it's misplaced. A tile three squares from its goal contributes 3 to Manhattan vs 1 to misplaced. Better heuristic = fewer nodes expanded = faster search. Both are admissible (Manhattan never overestimates moves needed).)
  4. What is iterative deepening A* (IDA*) and what problem does it solve? (Answer: A* problem: exponential memory usage — stores all frontier nodes. For complex problems, memory runs out before solution is found. IDA*: repeatedly depth-first search up to increasing f-cost thresholds. Start with threshold=h(root). DFS until f(n)>threshold → return minimum exceeded f value as new threshold. Repeat. Memory: O(d) — only current path stored, like DFS. Same optimal solutions as A* but uses linear memory. Trade-off: may re-expand nodes multiple times. Standard for memory-constrained heuristic search (15-puzzle, route planning on mobile).)
  5. What is a heuristic for the Travelling Salesman Problem (TSP) and is it admissible for A*? (Answer: Minimum Spanning Tree (MST) heuristic: compute MST of unvisited cities + the two cheapest edges connecting current city and one city back to start. Cost of MST ≤ optimal remaining tour (TSP tour is a spanning structure). Admissible: MST is always ≤ optimal tour cost. Efficient: Prim's/Kruskal's O(n² or n log n). Provides tight lower bound for branch-and-bound TSP solvers. Even with admissible heuristic, TSP A* is impractical for large n — the state space is n! and A* memory explodes.)
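The MST lower bound from question 5 can be sketched with Prim's algorithm. The 4-city distance matrix below is hypothetical; the point is that the MST weight (9 here) never exceeds the optimal tour cost, which keeps the heuristic admissible:

```python
def mst_cost(nodes, dist):
    """Prim's algorithm: total weight of a minimum spanning tree over nodes.
    As a TSP heuristic, the MST over unvisited cities is a lower bound on
    the cost of any tour segment that must still connect them."""
    nodes = list(nodes)
    if not nodes:
        return 0
    in_tree = {nodes[0]}
    total = 0
    while len(in_tree) < len(nodes):
        # Cheapest edge crossing from the tree to an outside node.
        c, nxt = min((dist[u][v], v)
                     for u in in_tree for v in nodes if v not in in_tree)
        total += c
        in_tree.add(nxt)
    return total

# Hypothetical symmetric 4-city distance matrix (cities 0-3).
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(mst_cost(range(4), dist))  # 9: edges 0-1 (2), 1-3 (4), 3-2 (3)
```

The optimal tour on this matrix costs 18 (e.g. 0→1→3→2→0), comfortably above the MST bound of 9, as admissibility requires.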

On LumiChats

Heuristic search is the conceptual foundation for how AI agents plan sequences of actions. Modern LLM-based agents use learned value estimates (analogous to heuristics) to prioritize which tool calls or reasoning steps to explore next.
