6

I've created a method of a TreeNode class that I'm wanting to return a flat list of an in order tree traversal

My sample tree is:

Sample tree data

The in order traversal output should be: [1, 1, 0, 2, 1, 3, 1, 1, 0]

but I'm getting: [2, 1, 1, 0, 1, 3, 1, 1, 0]

Here is my code:

def in_order_list(self, r = []):
    hasLeft = self.left is not None
    hasRight = self.right is not None
    if self is None:
        return
    else:
        if hasLeft:
            self.left.in_order_list(r)
        r.append(self.value)
        if hasRight:
            self.right.in_order_list(r)
    return r

Would anyone be able to give me a clue as to why this is happening?

Thanks Alex

Alex2134
  • 557
  • 7
  • 22

4 Answers4

4

Instead of calling self.left/right.in_order_list(), you're calling self.left/right.pre_order_list().

To accomplish what you want to do a generator function might be better (less memory-consuming and more pythonic) than to accumulate the values in a list:

class Tree(object):
    ...
    def in_order(self):
        if self.left is not None:
            for value in self.left.in_order():
                yield value
        yield self.value  #  <-- yielding the value of the node, not the node itself
        if self.right is not None:
            for value in self.right.in_order():
                yield value

...

tree = Tree(...)

in_order_values = list(tree.in_order())

That way, you don't have to create a list if you only want to iterate over the values:

for value in tree.in_order():
    ...

The algorithm works like this: the generator first descends recursively along the left branch of every node until it hits one with no left sub-node. Then it yields the value of the current node. After that it does the same on the right sub-node, but starting at the current node, not the root node. If we assume there are no cycles in the tree and no infinite branches, then there will definitely be leaf nodes, i.e. nodes with no left or right sub-node. IOW nodes, for which both base cases (self.left/right is None) are reached. Therefore the recursive calls will return, hopefully before we're out of memory or hit the stack limit.

The loop over self.left/right.in_order() is necessary due to the fact that what the call to in_order() returns is a generator, hence the name generator function. The returned generator must be exhausted somehow, e.g. through a loop. In the body of the loop we re-yield the values up one level, where they're re-yielded again, until they reach top level. There we use the values.

If you want to retrieve the nodes themself instead of only their value fields, you could do it like this:

class Tree(object):
    ...
    def in_order(self):
        if self.left is not None:
            for node in self.left.in_order():
                yield node
        yield self  #  <-- yielding the node itself
        if self.right is not None:
            for node in self.right.in_order():
                yield node

You probably want to do this, because not only can you still access the values of the nodes:

for node in tree.in_order():
    do_something_with(node.value)

but also you can do whatever you want with the nodes:

for node in tree.in_order():
    whatever(node.some_other_attribute)
pillmuncher
  • 10,094
  • 2
  • 35
  • 33
  • does the generator for left and right `yield` the node's value or the node object? – Alex2134 Feb 25 '12 at 06:00
  • Trying to get my head around what's happening, I'd say `for left` and `for right` recursive calls they `yield` the `node object` - is that correct? – Alex2134 Feb 25 '12 at 06:21
3

You can write this sort of thing really neatly as a generator, and avoid having to handle lists and appending:

 def in_order(tree):
    if tree is None: return

    for value in in_order(tree.left): yield value
    yield tree.value
    for value in in_order(tree.right): yield value

For example:

>>> class Node(object):
...     def __init__(self, value, left=None, right=None):
...             self.value, self.left, self.right = value, left, right
... 
>>> x = Node(3)
>>> x.left = Node(2)
>>> x.left.left = Node(1)
>>> x.left.left.left = Node(1)
>>> x.left.left.right = Node(0)
>>> x.left.right = Node(1)
>>> x.right = Node(1)
>>> x.right.left = Node(1)
>>> x.right.right = Node(0)
>>> 
>>> in_order(x)
<generator object in_order at 0xb742dbbc>
>>> list(in_order(x))
[1, 1, 0, 2, 1, 3, 1, 1, 0]
Katriel
  • 120,462
  • 19
  • 136
  • 170
  • thanks for your solution using a _generator_ reading on this it looks like a generator takes care of local variables between calls. Whilst using a _generator_ is powerful, would you say it's the most classic implementation for something like this? It is very elegant though. Thanks – Alex2134 Feb 25 '12 at 05:44
2

It might be a problem with the default argument for r = []

Example:

def foo(list=[]):
    list.append(1)
    print list


foo()
>>> [1]

foo()
>>> [1,1]

Python is keeping the same list over multiple function calls.

Try making the start of your function something like:

def in_order_list(self, r=None):
    if r is None:
        r = []

You might want to post the rest of your code, so we can know what that pre_order function does.

campos.ddc
  • 741
  • 5
  • 12
  • Hi @Matt and @campos, both of you asking me about what the `pre_order` function does has highlighted my mistake! The function should call itself `in_order_list` not the `pre_order` function. This method now works! Thanks @campos.ddc for the insight on multiple calls of the `list`. – Alex2134 Feb 24 '12 at 00:15
1

A) firstly as noted by campos.ddc, having the [] be assigned the value to the r parameter is problematic. For details of this consult this answer on Stackoverflow (it is a very common bug)

B) it would seem your "if self is None:" test is redundant, if self was None, you wouldn't be able to call the in_order_list method (assuming this is a method in a class, and not a standalone function)

C) The code could be simplified:

def in_order_list(self, r = None):
    if not r:
        r = []
    if self.left:
        self.left.in_order_list(r)
    r.append(self.value)
    if self.right:
        self.right.in_order_list(r)

D) Questions that are likely "homework" questions, should be labelled as such. >:(

Community
  • 1
  • 1
Adam Parkin
  • 17,891
  • 17
  • 66
  • 87
  • Use `if r is None` not `if not r` for testing against `None`ness -- otherwise you could get confused e.g. `0`. (This doesn't matter in this particular case, but is a good rule in general.) – Katriel Feb 24 '12 at 00:47
  • I disagree, the expectation is that r is a list, so the only way it can be evaluated to False is if A) it's None, or B) is empty. A list containing the value 0 would evaluate to True, and if r is the value 0 that would indicate an error on the caller's part. In either case, `if not r` more concise & readable (IMHO) than `if r is None:` – Adam Parkin Feb 24 '12 at 00:59
  • yes, but my point is that for comparing against singletons you should use `is`. To me, `if not r` immediately makes me think about non-`[]` boolean-`False` values. – Katriel Feb 24 '12 at 11:46
  • In other words, `if r is None` tests against the value immediately above, so it's clear that the only way it will succeed is if the default argument isn't overridden. `if not r` doesn't test against that value, so you have to think through the cases to see that it's correct. – Katriel Feb 24 '12 at 11:47
  • "so it's clear that the only way it will succeed is if the default argument isn't overridden" - or the user passes in `None`. We're splitting hairs here, I don't really see either as particularly right or wrong. – Adam Parkin Feb 24 '12 at 16:22
  • Thanks @AdamParkin. A) As campos.ddc mentioned an invaluable point - read your link. It's not homework as such, as I'm not taking a computer science course, just a personal project. – Alex2134 Feb 25 '12 at 05:52