94

I am looking for a good Tree data structure class. I have come across this package, but since I am relatively new to Python (not programming), I dont know if there are any better ones out there.

I'd like to hear from the Pythonistas on here - do you have a favorite tree script that you regularly use and would recommend?

[Edit]

To clarify, by 'Tree', I mean a simple unordered tree (Hmm, thats a bit of a recursive definition - but hopefully, that clarifies things somewhat). Regarding what I need the tree for (i.e. use case). I am reading tree data from a flat file and I need to build a tree from the data and traverse all nodes in the tree.

morpheous
  • 16,270
  • 32
  • 89
  • 120
  • At least a triple duplicate...http://stackoverflow.com/questions/2482602/a-general-tree-implementation-in-python as well as http://stackoverflow.com/questions/2358045/how-can-i-implement-a-tree-in-python-are-there-any-built-in-data-structures-in?rq=1 – eric Dec 29 '14 at 13:58
  • 1
    In the mean time, there is a simple, lightweight and extensible tree data structure: http://anytree.readthedocs.io/en/latest/ – c0fec0de Mar 14 '17 at 07:25
  • @c0fec0de best to declare that you're the author of the anytree package that you're recommending. – jarmod Sep 11 '22 at 16:15

9 Answers9

79

You can build a nice tree of dicts of dicts like this:

import collections

def Tree():
    return collections.defaultdict(Tree)

It might not be exactly what you want but it's quite useful! Values are saved only in the leaf nodes. Here is an example of how it works:

>>> t = Tree()
>>> t
defaultdict(<function tree at 0x2142f50>, {})
>>> t[1] = "value"
>>> t[2][2] = "another value"
>>> t
defaultdict(<function tree at 0x2142f50>, {1: 'value', 2: defaultdict(<function tree at 0x2142f50>, {2: 'another value'})}) 

For more information take a look at the gist.

Max
  • 2,036
  • 2
  • 18
  • 27
Stefan
  • 1,246
  • 1
  • 9
  • 13
41

I found a module written by Brett Alistair Kromkamp which was not completed. I finished it and make it public on github and renamed it as treelib (original pyTree):

https://github.com/caesar0301/treelib

May it help you....

caesar0301
  • 1,913
  • 2
  • 22
  • 24
  • 5
    license is GPL, what a pity – denfromufa Dec 28 '14 at 04:26
  • 13
    This license was given when I was even unaware of what a license mean. I know its a simple but useful module. Since version 1.3.0, I redistribute it under Apache License. Now you can use it where you need it with declamation of the original copyright. – caesar0301 Dec 29 '14 at 05:02
40

Roll your own. For example, just model your tree as list of list. You should detail your specific need before people can provide better recommendation.

In response to HelloGoodbye's question, this is a sample code to iterate a tree.

def walk(node):
    """ iterate tree in pre-order depth-first search order """
    yield node
    for child in node.children:
        for n in walk(child):
            yield n

One catch is this recursive implementation is O(n log n). It works fine for all trees I have to deal with. Maybe the subgenerator in Python 3 would help.

Wai Yip Tung
  • 18,106
  • 10
  • 43
  • 47
  • How do you you loop over all elements in such a tree in a 'pythonic' way? – HelloGoodbye Jan 30 '14 at 12:31
  • Typically you iterate a tree using DFS or BFS. I usually implement a generator using DFS like def walk(tree): ... – Wai Yip Tung Jan 30 '14 at 17:53
  • 1
    What is DFS and BFS? These acronyms are new to me. – HelloGoodbye Jan 30 '14 at 22:33
  • DFS is depth-first search. It is used for tree traversals that are pre-order, in-order, or post-order. BFS is breadth-first search. It is used for tree traversals that are level-order. – Dane White Jan 31 '14 at 18:50
  • 1
    Added a sample code for DFS. – Wai Yip Tung Jan 31 '14 at 23:37
  • 1
    Depth-first search means that the children of a node are visited before its siblings. so if you have ` [ A, [ B, [C, D] ], [ E, [ F, G ] ] ] `, then, assuming you visit B before E, you also visit C and D before E. Breadth-first search means all the nodes at the same level are visited before any of their children, so B and E would both be visited before any of C, D, F, or G. – Mark Reed Aug 21 '15 at 17:16
10

Building on the answer given above with the single line Tree using defaultdict, you can make it a class. This will allow you to set up defaults in a constructor and build on it in other ways.

class Tree(defaultdict):
    def __call__(self):
        return Tree(self)

    def __init__(self, parent):
        self.parent = parent
        self.default_factory = self

This example allows you to make a back reference so that each node can refer to its parent in the tree.

>>> t = Tree(None)
>>> t[0][1][2] = 3
>>> t
defaultdict(defaultdict(..., {...}), {0: defaultdict(defaultdict(..., {...}), {1: defaultdict(defaultdict(..., {...}), {2: 3})})})
>>> t[0][1].parent
defaultdict(defaultdict(..., {...}), {1: defaultdict(defaultdict(..., {...}), {2: 3})})
>>> t2 = t[0][1]
>>> t2
defaultdict(defaultdict(..., {...}), {2: 3})
>>> t2[2]
3

Next, you could even override __setattr__ on class Tree so that when reassigning the parent, it removes it as a child from that parent. Lots of cool stuff with this pattern.

Community
  • 1
  • 1
Sandy Chapman
  • 11,133
  • 3
  • 58
  • 67
  • Parent is broken for t[0][1][2] in the above example. AttributeError: 'int' object has no attribute 'parent' – MatthieuBizien Feb 01 '14 at 02:36
  • @oao This is not broken. You're specifying t[0][1][2] = 3. Therefore t[0][1][2] is not going to be a defaultdict type, but a Number type (as defaultdict is used to make defaults for missing elements). If you want it to be a defaultdict, you need to use t[0][1][2] without the assignment. – Sandy Chapman Feb 04 '14 at 19:57
7

It might be worth writing your own tree wrapper based on an acyclic directed graph using the networkx library.

Jean-François Corbett
  • 37,420
  • 30
  • 139
  • 188
Andrew Walker
  • 40,984
  • 8
  • 62
  • 84
7

For a tree with ordered children, I'd usually do something kind of like this (though a little less generic, tailored to what I'm doing):

class TreeNode(list):

    def __init__(self, iterable=(), **attributes):
        self.attr = attributes
        list.__init__(self, iterable)

    def __repr__(self):
        return '%s(%s, %r)' % (type(self).__name__, list.__repr__(self),
            self.attr)

You could do something comparable with a dict or using DictMixin or it's more modern descendants if you want unordered children accessed by key.

Matt Anderson
  • 19,311
  • 11
  • 41
  • 57
4

Here's something I was working on.

class Tree:
    def __init__(self, value, *children):
        '''Singly linked tree, children do not know who their parent is.
        '''
        self.value = value
        self.children = tuple(children)

    @property
    def arguments(self):
        return (self.value,) + self.children

    def __eq__(self, tree):
        return self.arguments == tree.arguments

    def __repr__(self):
        argumentStr = ', '.join(map(repr, self.arguments))
        return '%s(%s)' % (self.__class__.__name__, argumentStr)

Use as such (numbers used as example values): t = Tree(1, Tree(2, Tree(4)), Tree(3, Tree(5)))

Aaron Robson
  • 149
  • 2
  • 8
3

Would BTrees help? They're part of the Zope Object Database code. Downloading the whole ZODB package is a bit of overkill, but I hope the BTrees module would be at least somewhat separable.

Community
  • 1
  • 1
Jenn D.
  • 1,045
  • 4
  • 13
  • 22
2

I think, from my own experience on problems with more advanced data structures, that the most important thing you can do here, is to get a good knowledge on the general concept of tress as data structures. If you understand the basic mechanism behind the concept it will be quite easy to implement the solution that fits your problem. There are a lot of good sources out there describing the concept. What "saved" me years ago on this particular problem was section 2.3 in "The Art of Computer Programming".