7

Say I have an object:

var names = ["john", "jane", "al", "mary", "zane" ... 1000+ Names]

I want to create an auto-suggest to search these names.

What's the most efficient way of doing this? I've read creating a trie or ternary data structure is best, but I'm not sure how to implement these in js.

Any thoughts?

doremi
  • 14,921
  • 30
  • 93
  • 148
  • I'd start by investing the memory up-front to sort them, then you can start breaking them down with `slice` and a binary search. Keep the potential amount if data to search through as low as possible. (though I'm not sure why you have 1000+ names in a JavaScript array...) – Brad Christie Feb 24 '11 at 23:31
  • There are varying meanings of "efficient": memory efficient, time efficient. Which would you prefer? – David Weiser Feb 24 '11 at 23:33
  • Time efficient. Speed is of most importance for my application. – doremi Feb 24 '11 at 23:35
  • if it's only 1000 or so names, does linear search perform that badly? – Jesse Cohen Feb 24 '11 at 23:42
  • OK. Let's stipulate it's 5000. What I'm after is how to efficiently search a large object. – doremi Feb 24 '11 at 23:44
  • actually you could sort and use a binary search as well to quickly get to the first element, then linear search from there. but doing a profiling is a good point, so as not to optimize unnecessarily. – mellamokb Feb 24 '11 at 23:44
  • If you are getting the data from the server , why don't you sort it at the server side? and then as @mellamokb suggested , perform a binary search. – Clyde Lobo Feb 25 '11 at 04:29

5 Answers5

9

A trie would be a good solution. Your data set would look something like this:

{"j":
    {"a":
        ["jacob", "jane", ..],
    {"o":
        ["john", "joesph", ..],
    ..
};

You would index character by character as many levels deep as reasonable (so that the innermost arrays have maybe between 20-30 entries.) Then do a simple search on the array stored at the innermost layer.

You can generate this by looping through your collection of names, then check if the particular index entry exists. If so, go down one layer, check if the next characters exists, etc., until you reach the deepest level. Then insert into the array, or start a new array if there isn't one. If a character level doesn't exist while you are adding a new name, then create it. Then you would want to cache the final result instead of regenerating it on every request.

mellamokb
  • 56,094
  • 12
  • 110
  • 136
  • 3
    i wonder whether there is a performance difference to indexing the arrays by char or having each trie node be a static `array(27)` that you then access by letter index. it would be an interesting test, my thought is that an array declared like `[]` might be treated like a hash table, in which case you'd have a lot of over head here. interesting to benchmark – Jesse Cohen Feb 24 '11 at 23:49
  • Your structure is not correct. Currently you have a syntax error. It should be: `{"j": {"a": ["jacob", "jane", ..], {"o": ["john", "joesph", ..}, ..};` (note the pointy brackets, you want an object, not an array). – Felix Kling Feb 25 '11 at 01:08
  • @Jesse, all arrays are treated as hash tables by the language specification (array indices are converted to strings). However, some browsers optimise the implementation of arrays in the underlying native code, but in that case either `[]` or `Array(...)` will be treated as an "array". – David Tang Feb 25 '11 at 04:29
1

I think that a trie is a natural way to think about doing auto-suggest from a large pool -- what you have to do is a prefix search, and tries excel at this. That said, I'm really not sure how the underlying implementation of arrays works in javascript, so you'd have to benchmark it and see at what point a trie becomes efficient. That is, there is probably some number n below which it makes more sense to do linear search versus using a trie. To top that all off, since each browser uses a different js engine, the efficiency of this will probably differ.

That said, here is a trie implementation in js: http://notdennisbyrne.blogspot.com/2008/12/javascript-trie-implementation.html

If js arrays work the way I think they might (i.e. as fancy hash tables, meaning even doing trienode[10] will end up being a hash table lookup), then another simple option to consider is to store every prefix of a word in an array. e.g. for the name john you'd insert j jo joh john into an array, this would give you constant time lookup but of course use a lot of memory.

Jesse Cohen
  • 4,010
  • 22
  • 25
1

Why don't you sort the array using Array.sort()and then perform a binary search on the same ?

Here is a code demonstrating Binary Search in js.

http://www.nczonline.net/blog/2009/09/01/computer-science-in-javascript-binary-search/

Also check the comments on the page, it has a more efficient implementation of the binary search

Clyde Lobo
  • 9,126
  • 7
  • 34
  • 61
  • I think this would work great for exact searches, but if you want to suggest strings rather than search for them I'm not sure how useful this will be. – templatetypedef Feb 25 '11 at 04:38
0

If you like to find Jan in John than have a look at the PHP functions soundex and metaphone. This functions converts a string in a phonetic string. At http://php.net are some examples you could easily convert to JS. You are glad - English has no special characters.

Make a second array with this phonetic equalization and add a pointer to the source element. You need to multisort the second array. https://stackoverflow.com/a/9374631/817152

Translate the search word too.

Then use the interval search algorithm to be fast. https://stackoverflow.com/a/16371484/817152

Don't give up.

Community
  • 1
  • 1
B.F.
  • 477
  • 6
  • 9
0

You can use the Autocomplete of the Jquery UI framework. You'll find the documentation here.
It will avoid you to remake the wheel...

Spilarix
  • 1,418
  • 1
  • 13
  • 24
  • 10
    jQuery is not the solution here. It's a small app that doesn't need an entire js library tacked on. Plus, I want to learn to actually learn how to solve the problem. – doremi Feb 24 '11 at 23:39