
I'm currently working on a search engine implementation with Scrapy as the crawler and Elasticsearch as the search server. Scrapy and Elasticsearch both work fine, but I'm struggling to enable case-insensitive search with a German analyzer. A typical response (from a match_all query) looks like this:

"hits": {
    "total": 14,
    "max_score": 0.40951526,
    "hits": [
        {
            "_index": "uni",
            "_type": "items",
            "_id": "AVcuHuT6qni1Wq78foIA",
            "_score": 0.40951526,
            "_source": {
                "description": "...",
                "tags": [
                    "...",
                    "..."
                ],
                "url":"...",
                "author": "...",
                "content": "...",
                "date": "18.09.2015",
                "title": "..."
            },
            "highlight": {
                "content": [
                    "...",
                    "...",
                    "..."
                ]
            }
        }
    ]
}

I tried to add the following settings and mappings with "curl -XPUT localhost:9200/uni {...}":

{
    "mappings":{
        "_source":{
            "type":"object",
            "properties":{
                "title":{
                    "type":"string",
                    "analyzer":"german_lowercase"
                },
                "content":{
                    "type":"string",
                    "analyzer":"german_lowercase"
                },
                "description":{
                    "type":"string",
                    "analyzer":"german_lowercase"
                },
                "tags":{
                    "type":"array",
                    "analyzer":"german_lowercase"
                }
            }
        }
    },
    "settings":{
        "uni":{
            "analysis":{
                "analyzer":{
                    "german_lowercase":{
                        "type":"custom",
                        "tokenizer":"keyword",
                        "filter":[
                            "lowercase",
                            "german_stop",
                            "german_keywords",
                            "german_normalization",
                            "german_stemmer"
                        ]
                    }
                },
                "filter":{
                    "german_stop": {
                        "type": "stop",
                        "stopwords": "_german_"
                    },
                    "german_keywords": {
                        "type": "keyword_marker",
                        "keywords": []
                    },
                    "german_stemmer": {
                        "type": "stemmer",
                        "language": "light_german"
                    }
                }
            }
        }
    }
}

I'm not sure what's going wrong. Can anybody help out?

EDIT: Elasticsearch won't let me put those settings into the index because it already exists, and if I try to put the mapping separately I get a "missing mapping type" exception. For the settings, it fails to update the non-dynamic settings. So I'm asking more generally how I should update these settings/mappings in order to enable case-insensitive search (other posts describe the same problem).
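For reference, here is a minimal sketch of what a request body for a fresh index might look like, assuming Elasticsearch 2.x (suggested by the "string" type and the "_type" field above). It is not a confirmed fix. It rests on a few points: analysis settings are static, so they have to be supplied when an index is created (or while it is closed); mappings are keyed by the type name ("items" in the hits above) rather than "_source"; there is no "array" field type, since any field can hold several values; and the "keyword" tokenizer emits the whole field value as a single token, so the sketch swaps in "standard". The index name "uni_v2" is hypothetical, and the empty "german_keywords" filter is left out for brevity.

```python
import json

# Assumptions (not from the original post): the replacement index name
# "uni_v2" is hypothetical; "items" is the _type shown in the hits above.
NEW_INDEX = "uni_v2"
DOC_TYPE = "items"
ANALYZER = "german_lowercase"


def build_index_body():
    """Return a create-index body with a lowercasing German analyzer."""
    analysis = {
        "filter": {
            "german_stop": {"type": "stop", "stopwords": "_german_"},
            "german_stemmer": {"type": "stemmer", "language": "light_german"},
        },
        "analyzer": {
            ANALYZER: {
                "type": "custom",
                # "standard" splits text into words; "keyword" would keep
                # the whole field value as one single token.
                "tokenizer": "standard",
                "filter": [
                    "lowercase",
                    "german_stop",
                    "german_normalization",
                    "german_stemmer",
                ],
            }
        },
    }
    # All four fields share the analyzer; "tags" is a plain string field
    # because any field may hold a list of values.
    fields = ("title", "content", "description", "tags")
    properties = {
        name: {"type": "string", "analyzer": ANALYZER} for name in fields
    }
    return {
        # Analysis sits directly under "settings", not under the index name.
        "settings": {"analysis": analysis},
        # Mappings are keyed by the type name, not by "_source".
        "mappings": {DOC_TYPE: {"properties": properties}},
    }


if __name__ == "__main__":
    print("PUT /" + NEW_INDEX)
    print(json.dumps(build_index_body(), indent=4))
```

The printed JSON could be sent as the body of a PUT to the new index. Existing documents would then need to be indexed again (for example by re-running the crawler), because documents that are already indexed keep the tokens produced by the old analyzer.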

nomad