5

Looking around for python implementations of tries just so that I can understand what they are and how they work, I came across Justin Peel's patricia trie and found it very instructive: it's straightforward enough for one as new as I am to play around with it and learn from it.

However there's something I think I'm not understanding:

using Justin's class patricia() thus:

>>> p = patricia()
>>> words = ['foo','bar','baz']
>>> for x in words:
...     p.addWord(x)

I get a trie as a dictionary looking like this:

>>> p._d
{'b': ['a', {'r': ['', {}], 'z': ['', {}]}], 'f': ['oo', {}]}

addWord() and isWord() work as expected, but isPrefix() shows the following behavior which puzzles me:

>>> p.isPrefix('b')
True
>>> p.isPrefix('f')
True
>>> p.isPrefix('e')
False

good, as expected; and then

>>> p.isPrefix('ba')
True

also good, but then:

>>> p.isPrefix('bal')
True

and furthermore:

>>> p.isPrefix('ballance')
True
>>> p.isPrefix('ballancing act')
True

Something here I'm not understanding?

Community
  • 1
  • 1
jjon
  • 680
  • 1
  • 8
  • 23

1 Answers1

4

I believe the bug is in the following snippet of the code you're looking at:

       if w.startswith(node[0][:wlen-i],i):
            if wlen - i > len(node[0]):
                i += len(node[0])
                d = node[1]
            return True

it should actually be:

       if w.startswith(node[0][:wlen-i],i):
            if wlen - i > len(node[0]):
                i += len(node[0])
                d = node[1]
            else:
                return True
Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • You're quite right, Alex. This looks like precisely the bug. I did warn that it wasn't tested very well. It was just something I scratched up and haven't really used since then. I fixed the code in my original posted answer. – Justin Peel Jun 26 '10 at 04:36
  • Alex, Thanks very much for finding that, now it seems to work as expected. And thanks Justin for building this. Having access to the root as a dictionary I found very instructive: I think in outlines, indented text (this is the result of training in the humanities, not experience with python; however, this was exactly what drew me to python in the first place), so being able to prettyprint the trie really helped illustrate what was going on. I hope you'll post any further work you do on this: benchmark testing or whatever. I'd also love to hear any ideas on potential uses for Patricia tries. – jjon Jun 26 '10 at 16:15
  • @jjon, you're welcome! BTW, a nice article on Radix Trees in general (with some mention of applications) is at http://en.wikipedia.org/wiki/Radix_tree – Alex Martelli Jun 26 '10 at 18:16