I am using the SuffixTree package to retrieve the strings that match a given substring. The README
file contains this example:
>>> from SuffixTree import SubstringDict
>>> d = SubstringDict.SubstringDict()
>>> d['foobar'] = 1
>>> d['barfoo'] = 2
>>> d['forget'] = 3
>>> d['oo']
[1, 2]
The query returns the values of all the keys that contain 'oo'. But I didn't find any way to retrieve the matching keys as well. For example, I want a result like this:
>>> d['oo']
[['foobar', 1],
['barfoo', 2]]
This class contains only the methods ['__doc__', '__getitem__', '__init__', '__module__', '__setitem__', '_addToTree', '_lookupKeys', 'debug'], and I could not use any of them to get my desired output.
I found a workaround that gives the required result; I got the idea from the method _dictWordsTree() in the source file. I rewrote the code as:
>>> from SuffixTree import SubstringDict
>>> d = SubstringDict.SubstringDict()
>>> d['foobar'] = ['foobar', 1]
>>> d['barfoo'] = ['barfoo', 2]
>>> d['forget'] = ['forget', 3]
>>> d['fo']
[['foobar', 1], ['barfoo', 2], ['forget', 3]]
This gives the desired output. How can I get the same result without additionally storing the key as part of the value? (I have a large dataset, ~20MB.) I searched for similar threads (1, 2, 3, 4, 5), but they didn't help.
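
To make the goal concrete, here is a minimal sketch of one possible direction, under stated assumptions: keep each key once in a side list and store only a small integer id in the substring index. The names KeyedSubstringDict and NaiveSubstringIndex are hypothetical and not part of the SuffixTree package; NaiveSubstringIndex is only a linear-scan stand-in so the example runs without the library. Any object that supports d[key] = value and d[substring], such as SubstringDict.SubstringDict(), could presumably be passed in its place. Whether this actually saves memory depends on how the library stores keys internally, which I have not measured.

```python
class NaiveSubstringIndex:
    """Hypothetical stand-in for SubstringDict.SubstringDict (linear scan, no suffix tree)."""

    def __init__(self):
        self._items = []

    def __setitem__(self, key, value):
        self._items.append((key, value))

    def __getitem__(self, substring):
        # Return the values of every key that contains the substring.
        return [value for key, value in self._items if substring in key]


class KeyedSubstringDict:
    """Hypothetical wrapper: lookups return [key, value] pairs."""

    def __init__(self, index):
        self._index = index   # assumed to support d[key] = v and d[substring]
        self._keys = []       # each key is kept once, in insertion order
        self._values = []

    def __setitem__(self, key, value):
        self._keys.append(key)
        self._values.append(value)
        # Store a small integer id in the index instead of [key, value].
        self._index[key] = len(self._keys) - 1

    def __getitem__(self, substring):
        # Map the ids returned by the index back to [key, value] pairs.
        return [[self._keys[i], self._values[i]] for i in self._index[substring]]


d = KeyedSubstringDict(NaiveSubstringIndex())
d['foobar'] = 1
d['barfoo'] = 2
d['forget'] = 3
print(d['oo'])  # [['foobar', 1], ['barfoo', 2]]
```

Note that in this sketch assigning the same key twice adds a second entry rather than replacing the first.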