I am doing tests with elastic search in indexing wikipedia's topics.
Below my settings.
Results I expect is to have first result matching the exact string - especially if string is made by one word only.
Instead:
Searching for "g"
curl "http://localhost:9200/my_index/_search?q=name:g&pretty=True"
returns [Changgyeonggung, Lopadotemachoselachogaleokranioleipsanodrimhypotrimmatosilphioparaomelitokatakechymenokichlepikossyphophattoperisteralektryonoptekephalliokigklopeleiolagoiosiraiobaphetraganopterygon, ..] as first results (yes, serendipity time! that is a greek dish if you are curious [http://nifty.works/about/BgdKMmwV6B3r4pXJ/] :)
I thought because the results weight more "G" letters respect to other words.. but:
Searching for "google":
curl "http://localhost:9200/my_index/_search?q=name:google&pretty=True"
returns
[Googlewhack, IGoogle, Google+, Google, ..] as first results, and I would expect Google to be the first.
What is wrong in my settings for not hitting exact keyword if exists?
I used index and search analyzers for the reason suggested in this answer:[https://stackoverflow.com/a/15932838/305883]
Settings
# make index with mapping
curl -X PUT localhost:9200/test-ngram -d '
{
"settings": {
"analysis": {
"analyzer": {
"index_analyzer": {
"type" : "custom",
"tokenizer": "lowercase",
"filter": ["asciifolding", "title_ngram"]
},
"search_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["standard", "lowercase", "stop", "asciifolding"]
}
},
"filter": {
"title_ngram" : {
"type" : "nGram",
"min_gram" : 1,
"max_gram" : 10
}
}
}
},
"mappings": {
"topic": {
"properties": {
"name": {
"type": "string",
"boost": 10.0,
"index": "analyzed",
"index_analyzer": "index_analyzer",
"search_analyzer": "search_analyzer"
}
}
}
}
}
'