Depth-First Search (DFS)

Intuition

Imagine exploring a maze. At every fork you commit to one passage and follow it as far as it goes, ignoring the others for now. Only when you hit a dead end do you walk back to the last fork and try the next unexplored passage. That retreat is called backtracking, and it is the whole idea behind depth-first search: go as deep as possible along one branch, and back up only when you must.

DFS remembers where to back up to using a stack. The most natural way to write it is with recursion, where the language's own call stack does the remembering for you: each dfs(node) call sits on the stack while it explores that node's neighbours, and pops off — backtracking to its caller — the moment it finishes. The deepest call on the stack is always the place you are right now.

How It Works

Recursive DFS is four lines of idea:

  1. Mark the current node visited and record it.
  2. For each neighbour of the current node:
  3.   If that neighbour is unvisited, recurse into it (go one level deeper).
  4. When the loop ends, return — this is the backtrack to the caller.

On a tree every node is reachable exactly once, so step 3's "if unvisited" check never rejects anything — there are no back edges. On a general graph the check matters: it stops the search from re-entering a node it has already seen, which is what prevents an infinite loop on a cycle. An edge that points at an already-visited node is a back edge (if that node is still on the call stack, i.e. an ancestor) or a cross edge (if it has already finished) — DFS simply skips it.

Each vertex is entered once and each edge is examined once, giving O(V + E) time. The extra space is the recursion stack, which is O(V) in the worst case (a single deep path).

Step By Step

The default configuration is a balanced binary tree of 7 nodes (0–6):

       0
      / \
     1   2
    / \ / \
   3  4 5  6

Adjacency list: 0 → [1, 2], 1 → [3, 4], 2 → [5, 6], and 3, 4, 5, 6 are leaves.

Watch the call stack grow as DFS descends and shrink as it backtracks:

  1. dfs(0) — mark 0 visited (#1). Stack: [0]. Recurse into its first child, 1.
  2. dfs(1) — mark 1 visited (#2). Stack: [0, 1]. Recurse into 3.
  3. dfs(3) — mark 3 visited (#3). Stack: [0, 1, 3]. 3 is a leaf → return, backtrack to 1.
  4. dfs(4) — back in 1's loop, recurse into 4. Mark 4 visited (#4). Stack: [0, 1, 4]. Leaf → return to 1; 1's neighbours are done → return to 0.
  5. dfs(2) — back in 0's loop, recurse into 2. Mark 2 visited (#5). Stack: [0, 2]. Recurse into 5.
  6. dfs(5) — mark 5 visited (#6). Leaf → return to 2.
  7. dfs(6) — recurse into 6. Mark 6 visited (#7). Leaf → return to 2; 2 is done → return to 0; 0 is done → the search ends.

Discovery order: 0, 1, 3, 4, 2, 5, 6

Notice that node 1's entire subtree (3 and 4) is fully explored before sibling 2 is ever touched — that is the depth-first guarantee, and the reason the call stack reaches [0, 1, 3] before it ever holds [0, 2].

Tree vs. Graph mode

  • Tree mode (default): a tree has no cycles and one path to each node, so DFS never meets an already-visited neighbour — no back edges, no skips.
  • Graph mode: edges may form cycles or reconverge. The visited set guards the recursion; edges to already-visited nodes are shown as back/cross edges and skipped. On the Simple preset the edge 2 → 5 is a cross edge, because 5 is reached first through 4 → 5. On the Cyclic preset, 2 → 0 is a back edge to an ancestor still on the call stack.

Complexity

CaseTimeSpace
AllO(V + E)O(V)
  • Time: each vertex is entered once and each edge examined once.
  • Space: the recursion (call) stack holds at most one frame per node on the current path — O(V), reached when the graph is a single long chain.

Edge Cases

  • Single node, no edges: dfs(start) marks it, finds no neighbours, returns immediately.
  • Disconnected graph: a DFS from one start node only reaches its connected component; unreachable nodes stay unvisited. To cover everything, restart DFS from each still-unvisited node.
  • Cycles: the visited check turns the cycle's closing edge into a back edge and skips it, so the recursion terminates.
  • Deep / skewed input: a long path makes the call stack as tall as the path — in real code this is where recursive DFS can overflow the stack and an explicit stack is preferred.

Common Mistakes

  1. Dropping the visited check on a graph: without it, DFS revisits nodes and loops forever around any cycle.
  2. Confusing DFS with BFS: DFS goes deep first (a stack / recursion); BFS explores level by level (a queue). The data structure is the difference.
  3. Assuming a single "correct" order: the visit order depends on how neighbours are listed. 0 → [1, 2] and 0 → [2, 1] are both valid DFS traversals with different orders.
  4. Marking visited too late: mark a node the moment you enter dfs(node). Marking only after the neighbour loop lets a cycle pull you back in before the mark lands.