2

i am going to implement suffix tree for given string, i think it should delcared like this

struct suffix
{

char  letter;
suffix * left,*right; 

};
suffix *insert(suffix *node,char *s){

}

//here i am going to construct tree with all occurances of substrings and characters but dont know how use left and right part,is this tree sorted and arranged by strict ordering of characters like binary search tree?or?please help me ,i dont want to use some code on online,i need to implement it myselft,so please give me some hints,some little code

2 Answers2

4

The wikipedia is a start.

However actually doing it is to understand the suffix tree construction algorithm, the Weiner or the Ukkonen algorithm.

The Ukkonen algorithm is simpler: http://europa.zbh.uni-hamburg.de/pubs/pdf/GieKur1997.pdf

Also this link is less academic and more practical: http://www.cise.ufl.edu/~sahni/dsaaj/enrich/c16/suffix.htm

Try to start understanding the algorithm.

After that good luck, this is one of the toughest data structures. The only thing worst is the compressed and optimized versions of the suffix tree.

But all great programmers like big challenges.

4

Have a look at the Wikipedia description:

Suffix tree

Note, first of all, that a suffix tree is not a binary tree so your implementation outline is fundamentally flawed.

Next, it’s not enough to store a single character per node / branch; a suffix tree branch represents a substring which can be longer than a single character. It’s also usual to store just the start and end indices of the substring within the string, and not the string itself; otherwise the suffix tree would consume a lot of unnecessary memory.

Lastly, don’t use pointers here. They buy you nothing and only cause trouble. Use something like a boost::container::vector<suffix> instead (I’d suggest a std::vector<suffix>, but unfortunately standard library containers cannot deal with incomplete types).

Community
  • 1
  • 1
Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • so it means that i should use loop in insert method?one loop for entire string,another loop for looking all subsequent substring and add it to node? –  Mar 14 '12 at 14:33
  • @dato Well you certainly won’t get around a loop. – Konrad Rudolph Mar 14 '12 at 14:41
  • sorry for reply later,because i was out of home,when i create vectori can't access content of structure why?for example in struct suffix i declared string s,how can i access this string? –  Mar 14 '12 at 14:47
  • 1
    There is way too little information in your comment to answer this. Open a new question with the specific problem. – Konrad Rudolph Mar 14 '12 at 15:28