I want to understand how Google handles a missing space between two words. For example, take two words, word1 and word2. If I type 'word1word2' into the search box, it either asks whether I meant 'word1 word2' or simply understands that it should look for 'word1 word2'. Is there any information on what data structures and algorithms they use? In this answer, How to split text without spaces into list of words?, a trie data structure is suggested.
- It would be best to ask a Google developer. – Jul 13 '12 at 13:03
- This is not about data structures, but mainly about statistics and probability estimates. – usamec Jul 13 '12 at 13:07
- Possible duplicate of [How google split words bunched together (without spaces)?](https://stackoverflow.com/questions/53720647/how-google-split-words-bunched-together-without-spaces) – Fifi Dec 11 '18 at 21:32
2 Answers
In the candidate generation step of the spell corrector, you allow the omission of a space as a possibility, just as you allow the omission of other letters. Perhaps look at the spelling correction lecture here: http://nlp-class.org/ [sorry, self-promotion] or Peter Norvig's intro: http://norvig.com/spell-correct.html
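As a rough illustration of that idea, here is a minimal sketch in the style of Norvig's corrector, where inserting a space is treated as one more single-edit candidate. The word counts in `COUNTS` are made up for the example, and the names `edits1`, `score`, and `correct` are illustrative; this is not a description of what Google actually runs.

```python
from collections import Counter

# Toy counts (assumption): a real system would estimate these from a large corpus.
COUNTS = Counter({"word": 50, "words": 30, "several": 20,
                  "sword": 2, "search": 40, "box": 25})
TOTAL = sum(COUNTS.values())
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def prob(word):
    """Unigram probability of a word; 0 for unknown words."""
    return COUNTS[word] / TOTAL


def edits1(text):
    """All strings one edit away from text, including a single inserted space."""
    splits = [(text[:i], text[i:]) for i in range(len(text) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    # The extra candidate type: the user may have omitted a space.
    space_inserts = [l + " " + r for l, r in splits if l and r]
    return set(deletes + transposes + replaces + inserts + space_inserts)


def score(candidate):
    """Product of unigram probabilities of the words in the candidate."""
    p = 1.0
    for w in candidate.split(" "):
        p *= prob(w)
    return p


def correct(query):
    """Return the most probable candidate, or the query itself if none is known."""
    if COUNTS[query] > 0:
        return query
    candidates = [c for c in edits1(query) if score(c) > 0]
    return max(candidates, key=score) if candidates else query


print(correct("searchbox"))
print(correct("wrod"))
```

The statistics do the real work here: the space-split candidate wins only because both halves are known, reasonably frequent words. A production system would presumably use a much richer language model than unigram counts.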

I assume you must have a script (using Ajax, for example: http://net.tutsplus.com/tutorials/javascript-ajax/adding-a-jquery-auto-complete-to-your-google-custom-search-engine/).
Basically, you check the words against a dictionary. A space should not be required to recognize a word; it is just one possibility. For example, a simple (really simple) algorithm would be: given "severalwords", you check the first 3 letters. Nothing? Then you check the first 4, and so on.
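The prefix check described above can be sketched roughly as follows. The dictionary `WORDS` is a tiny hypothetical stand-in, and the backtracking (trying a longer prefix when the remainder cannot be split) is an addition beyond the "really simple" version, since a purely greedy check would stop at "sever" and fail.

```python
from functools import lru_cache

# Hypothetical dictionary; a real one would hold many thousands of words.
WORDS = frozenset({"several", "words", "word", "sever", "a", "an", "answer"})


def segment(text, min_len=1):
    """Split text into dictionary words, or return None if no split exists.

    Tries prefixes of increasing length and backtracks when the rest of the
    string cannot be segmented.
    """
    @lru_cache(maxsize=None)
    def helper(start):
        if start == len(text):
            return ()  # nothing left: successful split
        for end in range(start + min_len, len(text) + 1):
            prefix = text[start:end]
            if prefix in WORDS:
                rest = helper(end)
                if rest is not None:
                    return (prefix,) + rest
        return None  # no prefix starting here leads to a full split

    result = helper(0)
    return list(result) if result is not None else None


print(segment("severalwords"))
```

Passing `min_len=3` matches the "check the first 3 letters" starting point in the answer, at the cost of never finding words shorter than three letters. This sketch returns the first valid split it finds; choosing the most likely split among several would need word frequencies.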
Here is some explanation of the Google search engine: https://developers.google.com/search-appliance/documentation/60/admin_searchexp/ce_improving_search
This may help too: http://tm.durusau.net/?cat=1106
