
I'd like to search for terms such as GoogleEarth or googleearth using Elasticsearch.
Right now, if I search for 'Google', I get no results unless I use NGram or EdgeNGram.
I don't want to use nGram because it returns too many results, so for now I use a bool query with a multi-match query. With that setup I cannot match on partial words. I would like to be able to search for 'Google Earth', 'Google', or 'Earth' and get GoogleEarth back. How can I do this?

At the moment only the exact query 'GoogleEarth' returns the right result. I want a document to match whenever it contains the search term.

 // Java string literals need double quotes; single quotes are char literals.
 .setQuery(QueryBuilders.boolQuery().should(QueryBuilders.multiMatchQuery(query,
                                "title", "name", "tag")))

update

I want to search terms by exact match on the part I type. If I search 'google', I want to get 'google***', 'googleearth', and so on. I know that edgeNGram or nGram may also return less related results, so if possible I don't want to use either of them. Do you have any ideas?

Soo

1 Answer


I think you need to define a custom analyzer to tokenize words based on camel case - i.e. "GoogleEarth" needs to be tokenized into the parts "Google" and "Earth".

See the camelcase tokenizer section of http://www.elasticsearch.org/guide/reference/index-modules/analysis/pattern-analyzer/
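
To illustrate what camel-case tokenization does to the example term, here is a minimal plain-Java sketch. It does not use the Elasticsearch API; the class name `CamelCaseSketch`, its methods, and the regex are hypothetical and only approximate what a pattern-based analyzer might produce. Note that an all-lowercase value such as "googleearth" has no case boundary, so this approach would leave it as a single token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: mimics a camel-case analyzer with a regex.
class CamelCaseSketch {

    // Split on lower/digit-to-upper boundaries and on whitespace, then lowercase.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("(?<=[a-z0-9])(?=[A-Z])|\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // A query matches if any of its tokens equals an indexed token.
    static boolean matchesAny(String indexedText, String query) {
        List<String> indexedTokens = tokenize(indexedText);
        for (String queryToken : tokenize(query)) {
            if (indexedTokens.contains(queryToken)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("GoogleEarth"));                   // [google, earth]
        System.out.println(matchesAny("GoogleEarth", "Google"));       // true
        System.out.println(matchesAny("GoogleEarth", "Google Earth")); // true
        System.out.println(matchesAny("GoogleEarth", "googa"));        // false
    }
}
```

With tokens like these in the index, the queries 'Google', 'Earth', and 'Google Earth' would each share at least one term with "GoogleEarth", which is the behaviour the question asks for.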

nickdos
  • Thank you for your reply. I want to search terms by exact match: if I search 'google', I should get 'google***', 'googleearth', and so on. I know edgeNGram can do this, but then I also get less related terms. Do you have any ideas? – Soo Jul 25 '13 at 07:03
  • I don't understand your question. I suggest you update your question and provide numerous exact examples of queries and the source text you expect to match. Also expand what you mean by "less related terms". – nickdos Jul 25 '13 at 07:13
  • I just mean that nGram may give me a lot of less related results. If I query 'google', I may get 'googa', 'goooo'. – Soo Jul 25 '13 at 07:45
  • Too many unknowns to say. What are your mapping and settings? Which filters or analyzers are you using at index time and at search time? If you use ngram only during indexing then (I think) a search for google will only return hits that contain the text "google" at the start, middle, or end, and will not find "googa", etc. See this Q: http://stackoverflow.com/questions/6467067/how-to-search-for-a-part-of-a-word-with-elasticsearch. Note how the poster provides lots of code details about their setup and lots of examples. You need to do the same to get a decent answer. – nickdos Jul 25 '13 at 22:17
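
The index-time-only n-gram behaviour suggested in the last comment can be sketched in plain Java. This is a rough model, not the Elasticsearch implementation: the class `NGramSketch`, its methods, and the gram sizes 2 and 10 are assumptions chosen for illustration, and the search side is modelled as lowercasing only, so the query is looked up as one whole term.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: n-grams at index time, whole term at search time.
class NGramSketch {

    // Every lowercase substring of the text with length minGram..maxGram.
    static Set<String> nGrams(String text, int minGram, int maxGram) {
        Set<String> grams = new HashSet<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (int start = 0; start < lower.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= lower.length(); len++) {
                grams.add(lower.substring(start, start + len));
            }
        }
        return grams;
    }

    // The query is NOT split into grams; it must equal one indexed gram.
    static boolean matches(String indexedText, String query, int minGram, int maxGram) {
        return nGrams(indexedText, minGram, maxGram).contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(matches("GoogleEarth", "google", 2, 10)); // true
        System.out.println(matches("GoogleEarth", "earth", 2, 10));  // true
        System.out.println(matches("GoogleEarth", "googa", 2, 10));  // false
    }
}
```

Under these assumptions "googa" never matches because it is not a substring of the indexed text. Loosely related hits tend to appear when the query itself is also broken into n-grams, since short grams such as "go" or "oo" are then searched individually.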