2

What types of algorithms would work the quickest for a search of just what's being searched? I realize this is getting quite close to asking how Google-instant search works, but I'm no Algorithms expert and I've been becoming increasingly interested in them. Is a search like this done using suffix trees or something similar? I guess I'm just interested in querying little strings as opposed to lots of crawled data the way Google does.

Thanks much for any input!

Jon Phenow
  • 3,974
  • 5
  • 26
  • 30
  • 1
    You may also like reading Map-reduce paper by Google. http://labs.google.com/papers/mapreduce.html may be reverse index as well http://en.wikipedia.org/wiki/Reverse_index – Nishant Feb 15 '11 at 04:47
  • All great reads by each answer and this comment... damn I think I set myself for more of a forum of answers... I'll read a bit more and try to pick one I like the most. This type of stuff has always baffled me and now I finally get some understanding. – Jon Phenow Feb 17 '11 at 16:09

3 Answers3

2

For those types of queries you can store the data in a Trie or a kind of Trie tree.

Murilo Vasconcelos
  • 4,677
  • 24
  • 27
1

Tree is fine, but you don't need to put your array in a multidimensional array. Here is my way to do it with big arrays in JS.

You need to sort the array.

Jump to the middle of the array. Loop: If the array item is smaller then tosearch, jump to the middle of the upper half; Else if the array item is bigger then tosearch, jump to the middle of the lower half; else you found it. etc.

var maxstep=Math.abs((Math.log(0.5)-Math.log(array.length))/Math.log(2)-1);

function searchinterval(tosearch,array){
         var len=array.length,
             pos=range=len/2,
             index=Math.round(pos),
             maxstep=.49999;
         for(var i=0;i<=maxstep;i++){
              range/=2;
              if(tosearch<array[index]){
                pos-=range;
                }
              else if(tosearch>array[index]){
                pos+=range;
                }
              else{
                return index;
                //you found it
                }
              index=Math.round(pos);
              }
         return false;
         }

If tosearch not exists in the Array this function is slow. Means seven loops for an array length of 200 I'm not certain with the maximal number of steps or step size.

PS: Think I found out the maximum of steps:(thanks Maxima)

Log(0.5)-Log(array_length))/Log(2) -1); 
B.F.
  • 477
  • 6
  • 9
1

If it is just for trying small set of strings, then of the standard search algorithms is a good place to start. Searching each character at a time and finding the common set of characters between two sets, is best accomplished using Dynamic programming technicals and one such example is finding Longest common subsequence.

Senthil Kumaran
  • 54,681
  • 14
  • 94
  • 131