26

In an algorithms course I'm taking, it's said that depth-first search (DFS) is far more space efficient than breadth-first search (BFS).

Why is that?

Although they are basically doing the same thing, in DFS we're stacking the current node's successors while in BFS we're enqueueing the successors.

nbro
  • 15,395
  • 32
  • 113
  • 196
HSN
  • 783
  • 3
  • 11
  • 20

3 Answers3

68

Your confusion is stemming from the fact that you apparently assume that DFS algorithm can be obtained from BFS algorithm by replacing the FIFO queue with a LIFO stack.

This is a popular misconception - it is simply not true. The classic DFS algorithm cannot be obtained by replacing the BFS queue with a stack. The difference between these algorithms is much more significant.

If you take a BFS algorithm and simply replace the FIFO queue with a LIFO stack, you will obtain something that can be called a pseudo-DFS algorithm. This pseudo-DFS algorithm will indeed correctly reproduce the DFS vertex forward traversal sequence, but it will not have DFS space efficiency and it will not support DFS backward traversal (backtracking).

Meanwhile, the true classic DFS cannot be obtained from BFS by such a naive queue-to-stack replacement. The classic DFS is a completely different algorithm with significantly different core structure. True DFS is a genuinely recursive algorithm that uses stack for backtracking purposes, not for storing the vertex discovery "front" (as is the case in BFS). The most immediate consequence of that is that in DFS algorithm the maximum stack depth is equal to the maximum distance from the origin vertex in the DFS traversal. In BFS algorithm (as in the aforementioned pseudo-DFS) the maximum queue size is equal to the width of the largest vertex discovery front.

The most prominent and extreme example that illustrates the difference in peak memory consumption between DFS and BFS (as well as pseudo-DFS) is a star-graph: a single central vertex surrounded by a large number (say, 1000) of peripheral vertices, with each peripheral vertex connected to the central vertex by an edge. If you run BFS on this graph using the central vertex as origin, the queue size will immediately jump to 1000. The same thing will obviously happen if you use pseudo-DFS (i.e. if you simply replace the queue with a stack). But classic DFS algorithm will need stack depth of only 1 (!) to traverse this entire graph. See the difference? 1000 versus 1. This is what is meant by better space efficiency of DFS.

Basically, take any book on algorithms, find a description of classic DFS and see how it works. You will notice that the difference between BFS and DFS is far more extensive that a mere queue vs. stack.

P.S. It should also be said that one can build an example of a graph that will have smaller peak memory consumption under BFS. So the statement about better space efficiency of DFS should be seen as something that might apply "on average" to some implied class of "nice" graphs.

AnT stands with Russia
  • 312,472
  • 42
  • 525
  • 765
  • 7
    Something worth noting - Pseudo-DFS should give you `O(depth * branching factor)` space, as opposed to `O(depth)` for proper DFS, which is still better than `O(width)`. Seems your answer says something similar, although I think this is more brief and clear. – Bernhard Barker Dec 06 '13 at 18:20
  • 2
    Note that although true DFS is often better implemented using recursion, it is not a necessary characteristic of true DFS. You can certainly implement true DFS without any recursion. The defining characteristic that separates true DFS and what you call pseudo DFS is how they use the stack. True DFS use it to store backtracking information in contrast to pseudo DFS which used the stack to store the vertex discovery front. – Lie Ryan Dec 14 '13 at 05:45
  • 1
    @Lie Ryan: I don't want to mix algorithm itself and implementation of algorithm. When I say "recursion" is refer to algorithmic recursion, not language recursion. I.e. by recursion I mean any algorithm that is based on divide-and-conquer strategy and uses non-constant-size stack (LIFO) to store postponed tasks. Such algorithm is inherently "recursive" by nature, regardless of whether it is implemented through language recursion or manual stack. – AnT stands with Russia Dec 14 '13 at 05:51
  • 1
    Recursion is still using a stack even if it is not explicit... (or uses the heap if your language requires it) – Thomas Eding Apr 20 '16 at 23:29
  • 2
    According to Artificial Intelligence - A Modern Approach, DFS is in fact BFS with a LIFO queue instead of a FIFO queue... The algorithm you are decribing they call backtracking search. – Bart Louwers May 01 '16 at 22:02
  • @ultrabowser: Well, in a formal article (book, thesis etc.) one can define any term to mean anything, as long as the term is properly defined at the beginning. However, DFS is an old and well-established term. Redefining it (albeit formally possible) is not a reasonable thing to do, even if what I called "pseudo-DFS" shares certain similarities with "classic" DFS. But in the end, it is a question to the authors of the book. – AnT stands with Russia May 02 '16 at 15:09
  • If you're traversing a general graph (not a tree), you'll need it to keep track of visited nodes. With it the memory consumption is O(#nodes), so I don't see any advantage for DFS over BFS in terms of memory. If this answer was specifically about trees, then you should clarify that. – max Nov 13 '16 at 09:36
  • @max: If you are talking about the ability to label the vertics, then *both* algorithms depend on being able to label the vertices. In that sense they are similar, which means that we can take this out of consideration and focus on how they differ. – AnT stands with Russia Nov 13 '16 at 15:47
  • 1
    Yes, I was referring to labeling vertices. And I am not sure it makes sense to take it out of consideration completely: since labeling will require O(#nodes) space for either algorithm, the differences in additional space used by each algorithm becomes far less important. In other words, DFS has a huge advantage over BFS on trees, where it takes O(1) instead of O(N) space. But DFS has no meaningful advantage on general graphs where both take O(N) space. The (at best) a small constant factor space advantage would rarely be of any practical importance. – max Nov 13 '16 at 17:17
  • Hi can someone post a pseudocode or code or at least link to code for the above classic DFS ? I am able to understand but I am unable to implement it. when i have just one child then i loose track of the siblings.. – Harish Kayarohanam Feb 03 '18 at 15:53
  • Can someone check if my approach is correct ? The idea is to keep track of parent in stack instead of storing all siblings in the stack. the parent of a node in the stack is the element below it in the stack. so with this approach we need not store siblings due to lost parent anymore. – Harish Kayarohanam Feb 03 '18 at 16:28
  • I tried to implement it. And it worked with just size of stack any time at most equal to depth. but i see that each time when i go a take a node in stack, i have to iterate the children to see which are all visited. is there any way i can avoid doing it each time for the same node ? – Harish Kayarohanam Feb 03 '18 at 21:06
7

In DFS you need only space for linear to the depth O(log(n)) on a fully balanced tree while BFS (Breadth-first search) needs O(n) (The widest part of the tree is the lowest depth which has in a binary tree n/2 nodes).

Example:

               1
              / \  
             /   \  
            /     \ 
           /       \
          /         \  
         /           \  
        /             \ 
       /               \
       2               2 
      / \             / \ 
     /   \           /   \  
    /     \         /     \  
   /       \       /       \  
   3       3       3       3
  / \     / \     / \     / \ 
 /   \   /   \   /   \   /   \  
 4   4   4   4   4   4   4   4
/ \ / \ / \ / \ / \ / \ / \ / \ 
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 

DFS needs space: 4
BFS needs in the second last row space 8

And it gets worse if the branching factor is higher

MrSmith42
  • 9,961
  • 6
  • 38
  • 49
  • 2
    For BFS, you need to store each element in the last row as well (because after visiting all elements in the second-last row, the queue will contain all elements in the last row), so it'll be 16, not 8. If you know the maximum depth of your tree beforehand, you can probably optimize this, but it doesn't happen in standard implementations. – Bernhard Barker Dec 06 '13 at 17:59
  • @Dukeling: Depends a little on your implementation. If you store the nodes still to visit or the nodes you have visited but not their children. But it makes no difference in the `O`-Notation `O(n/2) = O(n/4) = O(n)`. – MrSmith42 Dec 06 '13 at 18:08
  • 1
    "the nodes you have visited but not their children" - I don't think I've ever seen that done before, but it does actually make perfect sense (more sense, in fact, than the alternative). – Bernhard Barker Dec 06 '13 at 18:14
3

In DFS, the space used is O(h) where h is the height of the tree.

In BFS, the space used is O(w) where w is the 'width' of the tree.

In typical binary trees (i.e. random binary trees), w = Omega(n) and h = O(sqrt(n)).

In balanced trees, w = Omega(n) and h = O(log n).

W. Hoare.
  • 39
  • 1
  • In addition to this, also see [this](http://stackoverflow.com/a/3332994/202504) answer comparing DFS and BFS. – jmiserez Dec 06 '13 at 16:58
  • 1
    [this Q&A](http://cs.stackexchange.com/q/6342/2909) suggests the expected height of a random binary tree is O(log n), not O(sqrt(n)). – Will Ness Mar 16 '14 at 13:56