I'm currently working on a search engine implementation with Scrapy as the crawler and Elasticsearch as the server. Scrapy and Elasticsearch work fine, but I'm struggling to enable case-insensitive search with a German analyzer. A match_all query returns hits with this general structure:
"hits": {
  "total": 14,
  "max_score": 0.40951526,
  "hits": [
    {
      "_index": "uni",
      "_type": "items",
      "_id": "AVcuHuT6qni1Wq78foIA",
      "_score": 0.40951526,
      "_source": {
        "description": "...",
        "tags": [
          "...",
          "..."
        ],
        "url": "...",
        "author": "...",
        "content": "...",
        "date": "18.09.2015",
        "title": "..."
      },
      "highlight": {
        "content": [
          "...",
          "...",
          "..."
        ]
      }
    }
  ]
}
I tried to add the following settings and mappings with "curl -XPUT localhost:9200/uni {...}":
{
  "mappings": {
    "_source": {
      "type": "object",
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "german_lowercase"
        },
        "content": {
          "type": "string",
          "analyzer": "german_lowercase"
        },
        "description": {
          "type": "string",
          "analyzer": "german_lowercase"
        },
        "tags": {
          "type": "array",
          "analyzer": "german_lowercase"
        }
      }
    }
  },
  "settings": {
    "uni": {
      "analysis": {
        "analyzer": {
          "german_lowercase": {
            "type": "custom",
            "tokenizer": "keyword",
            "filter": [
              "lowercase",
              "german_stop",
              "german_keywords",
              "german_normalization",
              "german_stemmer"
            ]
          }
        },
        "filter": {
          "german_stop": {
            "type": "stop",
            "stopwords": "_german_"
          },
          "german_keywords": {
            "type": "keyword_marker",
            "keywords": []
          },
          "german_stemmer": {
            "type": "stemmer",
            "language": "light_german"
          }
        }
      }
    }
  }
}
I'm not sure what's going wrong. Can anybody help out?
EDIT: Elasticsearch won't allow me to put those settings into the index because it already exists, and if I try to put the mapping separately I get a "missing mapping type" exception. In the case of the settings, it fails to update the non-dynamic settings. So I'm asking more generally how I should update those settings/mappings in order to enable case-insensitive search (other posts describe the same problem).
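For discussion, here is a minimal sketch of how I suspect the body would have to look if the index were created from scratch instead of updated in place. It is written in Python only to build and print the JSON; nothing is sent to Elasticsearch. Several things in it are assumptions rather than confirmed fixes: the mapping is keyed by the type name "items" (taken from the "_type" field in the hit above) instead of "_source", the "analysis" block sits directly under "settings" instead of under "uni", "tags" is mapped as "string" because arrays have no dedicated type, the "keyword" tokenizer is swapped for "standard" so that individual words become tokens, and the empty "german_keywords" filter is left out. The helper name build_index_body is hypothetical.

```python
import json

# Assumption: the mapping type name, taken from "_type" in the search hit above.
DOC_TYPE = "items"

# Fields that should be analyzed with the custom German analyzer.
ANALYZED_FIELDS = ("title", "content", "description", "tags")


def build_index_body():
    """Build a create-index body: analysis settings plus a mapping keyed by type name."""
    return {
        "settings": {
            # Assumption: "analysis" goes directly under "settings", not under the index name.
            "analysis": {
                "filter": {
                    "german_stop": {"type": "stop", "stopwords": "_german_"},
                    "german_stemmer": {"type": "stemmer", "language": "light_german"},
                },
                "analyzer": {
                    "german_lowercase": {
                        "type": "custom",
                        # Swapped in: "standard" instead of "keyword", so words become tokens.
                        "tokenizer": "standard",
                        "filter": [
                            "lowercase",
                            "german_stop",
                            "german_normalization",
                            "german_stemmer",
                        ],
                    }
                },
            }
        },
        "mappings": {
            DOC_TYPE: {
                "properties": {
                    # Arrays of strings use the plain "string" type.
                    name: {"type": "string", "analyzer": "german_lowercase"}
                    for name in ANALYZED_FIELDS
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into a create-index request.
    print(json.dumps(build_index_body(), indent=2))
```

If this shape is right, the printed JSON would be the body for creating a new index, after which the documents would have to be indexed again. I have not verified that yet.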