0

Intuitively I understand that I keep on stack pairs like (node, iterator), but still I can not get to a working solution.

Sergey Potapov
  • 3,819
  • 3
  • 27
  • 46
  • which language are you using, can you post the code which you worked on ? – zenwraight Dec 31 '17 at 03:39
  • Every time you pop the stack, increment the child iterator. If there is a next child, push that onto the stack and repeat; if not, pop the current node. This follows directly from the special case for binary trees. – meowgoesthedog Dec 31 '17 at 04:19

1 Answers1

3

You can always translate a recursive algorithm to one with an explicit stack. Start with recursive code:

void traverse(NODE *p) {
  if (p) {
    for (int i = 0; i < p->n_children; ++i) {
      traverse(p->child[i]);
    }
  }
}

Replace the recursive call:

struct {
  NODE *p;
  int i;
} stk[MAX_RECURSION_DEPTH];
int sp = 0;

void traverse(NODE *p) {
 start:
  if (p) {
    for (int i = 0; i < p->n_children; ++i) {
      // Save local values on stack.
      stk[sp].p = p;
      stk[sp++].i = i;
      // Simulate recursive call.  
      p = p->child[i];        
      goto start;
      // Goto this label for return.
     rtn:
    }
  }
  // Simulate recursive return, restoring from stack if not empty.
  if (sp == 0) return;
  p = stk[--sp].p;
  i = stk[sp].i;
  goto rtn;
}

There you have it: an explicit stack implementation that has to work as long as the recursive version did. It's the same thing.

Now if you like, we an do some algebra to eliminate the gotos. First we can rewrite the for as a while and refactor the rtn label

void traverse(NODE *p) {
  int i;
 start:
  if (p) {
    i = 0;
   rtn_2:
    while (i < p->n_children) {
      stk[sp].p = p;
      stk[sp++].i = i;
      p = p->child[i];        
      goto start;
    }
  }
  if (sp == 0) return;
  p = stk[--sp].p;
  i = stk[sp].i;
  ++i;
  goto rtn_2;
}

Note the ++i inside the while was dead code, so was safe to drop.

Now note that the body of the while never executes more than once. It can be replaced with an if. We can also replace goto rtn_2 with the code that it causes to execute.

void traverse(NODE *p) {
  int i;
 start:
  if (p) {
    i = 0;
    if (i < p->n_children) {
      stk[sp].p = p;
      stk[sp++].i = i;
      p = p->child[i];        
      goto start;
    }
  }
  for (;;) {
    if (sp == 0) return;
    p = stk[--sp].p;
    i = stk[sp].i;
    ++i;
    if (i < p->n_children) {
      stk[sp].p = p;
      stk[sp++].i = i;
      p = p->child[i];        
      goto start;
    }
  }
}

Finally, we can get rid of the start label by using a loop instead:

void traverse(NODE *p) {
  int i;
  for (;;) {
    if (p) {
      i = 0;
      if (i < p->n_children) {
        stk[sp].p = p;
        stk[sp++].i = i;
        p = p->child[i];        
        continue;
      }
    }
    for (;;) {
      if (sp == 0) return;
      p = stk[--sp].p;
      i = stk[sp].i;
      ++i;
      if (i < p->n_children) {
        stk[sp].p = p;
        stk[sp++].i = i;
        p = p->child[i];        
        break;
      }
    }
  }
}

Another cleanup is to note that i is always 0 in the first if, and the continue is actually implementing a nested loop, which we can make explicit. There's also a redundant stk[sp].p = p; that can be deleted. It's just copying a value onto the stack that's already there:

void traverse(NODE *p) {
  for (;;) {
    while (p && p->n_children > 0) {
      stk[sp].p = p;
      stk[sp++].i = 0;
      p = p->child[0];        
    }
    for (;;) {
      if (sp == 0) return;
      p = stk[--sp].p;
      int i = stk[sp].i + 1;
      if (i < p->n_children) {
        stk[sp++].i = i; // stk[sp].p = p; was redundant, so deleted
        p = p->child[i];        
        break;
      }
    }
  }
}

It's possible to make the code a bit prettier, but I'll leave that to you. The thing to note is that there was no intuition or trying to imagine what pointers are doing. We just did algebra on the code, and a reasonably nice implementation resulted. I haven't run it, but unless I made an algebra mistake (which is possible), this ought to "just work."

Note this is a bit different from the typical stack-based DFS you'll see in textbooks. Those push all the children of a newly found node on the stack, which must be done rightmost child first to get a normal DFS order.

Instead here we are pushing the parent along with an integer saying which child should be searched next. This is the node + iterator you mentioned. It's a bit more complex but also more efficient in stack size. The max size of our stack is O(D) where D is the max depth of the tree. The size of the stack in the textbook algorithm is O(KD) where K is the max number of children a node can have.

Gene
  • 46,253
  • 4
  • 58
  • 96