ElasticSearch: Configuring a custom analyzer implementation

Question

I am currently evaluating whether and how a legacy Lucene-based analyzer component can be moved to Elasticsearch (0.19.18). Since the legacy code is based on Lucene, I wrapped the analyzer in an ES plugin. The analyzer's configuration looks like this:

index.analysis.analyzer.myAnalyzer.type: myAnalyzer
index.analysis.analyzer.default.type: myAnalyzer
index.analysis.analyzer.default_index.type: myAnalyzer
index.analysis.analyzer.default_search.type: myAnalyzer

So far so good.

curl -XGET 'localhost:9200/_analyze' -d 'Some text'

would return an object that contains the correctly tokenized text, but

curl -XGET 'localhost:9200/<name-of-my-index>/_analyze' -d 'Some text'

would return text that is not tokenized at all. Apparently only the lowercase filter is applied instead of myAnalyzer. The objects in the index are not analyzed correctly either.

The index mappings look like this (output from head-plugin):

mappings: {
    item: {
        analyzer: myAnalyzer
        properties: {
            id: {
                type: string
            }
            itemnumber: {
                type: string
            }
            articletext: {
                analyzer: myAnalyzer
                type: string
            }
            sortvalue: {
                type: string
            }
            salesstatus: {
                format: dateOptionalTime
                type: date
            }
        }
    }
}

Since I am new to ES, I can't figure out what the reason for this behaviour actually is. Does anybody have an idea?

What do you get when you run `curl -XGET 'localhost:9200//_settings'` ? — imotov, Jul 16 '12 at 18:46
`{"myIndex":{"settings":{"index.version.created":"190899","index.number_of_replicas":"0","index.number_of_shards":"1"}}}` — GLA, Jul 17 '12 at 07:49

score 2 · Answer 1 · answered Nov 12 '12 at 12:15

This is how I set a custom default analyzer in Elasticsearch:

index:
  analysis:
    analyzer:
      default:
        filter: [lowercase]
        tokenizer: whitespace
        type: custom

Works like a charm.
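If you would rather send these settings when creating an index than keep them in a YAML file, the same structure can be written as a JSON body. The following is only a minimal Python sketch: the function names `default_analyzer_body` and `flatten` are made up for illustration, and it assumes the create-index request accepts a `settings` object in its body. It builds the body and also prints the dotted-key form used in the question, so the two notations can be compared.

```python
import json


def default_analyzer_body():
    """Build a request body mirroring the YAML above (hypothetical helper)."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        # 'default' is the analyzer name; 'custom' is its type.
                        "default": {
                            "type": "custom",
                            "tokenizer": "whitespace",
                            "filter": ["lowercase"],
                        }
                    }
                }
            }
        }
    }


def flatten(node, prefix=""):
    """Turn nested settings into dotted keys such as index.analysis.analyzer.default.type."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


if __name__ == "__main__":
    body = default_analyzer_body()
    # JSON form, e.g. to pass as the body of an index-creation request.
    print(json.dumps(body, indent=2))
    # Dotted-key form, comparable to the flat settings lines in the question.
    for key, value in sorted(flatten(body["settings"]).items()):
        print(f"{key}: {value}")
```

The nested and the dotted notation describe the same settings, so either style should work as long as the analyzer is registered under the name `default`.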

thomax


I stumbled upon your answer while trying to use a custom, already-defined analyzer in "type", e.g. "french2", and "custom" made it work! Thanks. – maxbeaudoin Nov 17 '14 at 20:10
Just to emphasize, the key is to use 'default' as the custom analyzer name. – vim Dec 15 '15 at 06:58
