
I am creating the mappings for an index I will be using in a project. Given the domain of the features, I'd like most of the fields to be searchable through case-insensitive term queries. I worked up a custom analyzer (like the one suggested here: Elasticsearch Map case insensitive to not_analyzed documents), but when I try to index a document, the request hangs for 60 seconds until it times out and the whole operation fails. I see the same behavior when I test in Sense.

Here is the index definition:

PUT /emails
{
   "mappings": {
      "email": {
         "properties": {
            "createdOn": {
               "type": "date",
               "store": true,
               "format": "strict_date_optional_time||epoch_millis"
            },
            "data": {
               "type": "object",
               "dynamic": "true"
            },
            "from": {
               "type": "string",
               "store": true
            },
            "id": {
               "type": "string",
               "store": true
            },
            "sentOn": {
               "type": "date",
               "store": true,
               "format": "strict_date_optional_time||epoch_millis"
            },
            "sesId": {
               "type": "string",
               "store": true
            },
            "subject": {
               "type": "string",
               "store": true,
               "analyzer": "standard"
            },
            "templates": {
               "properties": {
                  "html": {
                     "type": "string",
                     "store": true
                  },
                  "plainText": {
                     "type": "string",
                     "store": true
                  }
               }
            },
            "to": {
               "type": "string",
               "store": true
            },
            "type": {
               "type": "string",
               "store": true
            }
         }
      },
      "event": {
         "_parent": {
            "type": "email"
         },
         "properties": {
            "id": {
               "type": "string",
               "store": true
            },
            "origin": {
               "type": "string",
               "store": true
            },
            "time": {
               "type": "date",
               "store": true,
               "format": "strict_date_optional_time||epoch_millis"
            },
            "type": {
               "type": "string",
               "store": true
            },
            "userAgent": {
               "type": "string",
               "store": true
            }
         }
      }
   },
   "settings": {
      "number_of_shards": "5",
      "number_of_replicas": "0",
      "analysis": {
         "analyzer": {
            "default": {
               "tokenizer": "keyword",
               "filter": [
                  "lowercase"
               ],
               "type": "custom"
            }
         }
      }
   }
}

As you can see, I define the analyzer as "default" (if I try to use another name and set it as the default analyzer for each of the two types, I get a "Root mapping definition has unsupported parameters: [analyzer : my_analyzer]" error).
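If I understand correctly, in Elasticsearch 2.x a named analyzer can only be referenced from individual string fields, not at the root of a type mapping, which would explain that error. For example (using the subject field just as an illustration):

"subject": {
   "type": "string",
   "store": true,
   "analyzer": "my_analyzer"
}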

And this is how I am trying to add a document to the index:

POST /emails/email/1
{
    "from": "email-address-1",
    "to": "email-address-2",
    "subject": "Hello world",
    "data":{
        "status": "SENT"
    }
}

I really can't understand why this timeout is happening. I also tried using NEST from a C# console application, with the same behavior.

Thanks.

PS: for testing I am using both Elasticsearch 2.3 hosted by AWS and Elasticsearch 2.3 hosted in a local Docker container.

Kralizek
  • Do you have enough nodes in the cluster to make having 5 replicas worthwhile at this stage in development? – Russ Cam Dec 10 '16 at 01:02
  • It's a go-to-production cluster temporarily made of a single node. At the moment it is as empty as a newly created cluster can be. – Kralizek Dec 10 '16 at 01:06
  • I think that should be 5 shards and 1 replica. While you're developing, you can set replicas to 0 and then update that before moving to production. – Russ Cam Dec 10 '16 at 01:19
  • I noticed the index definition above is not the one giving me problems. I updated the question. – Kralizek Dec 10 '16 at 01:43
  • Changing your index to 5 shards and 1 replica solves the problem. The issue with 5 replicas and 1 primary shard on one node is that there are not enough active copies to meet the default quorum write consistency of 4 (`n/2 +1`), since the replicas will all be unassigned on the single node. You'll see a `UnavailableShardsException` in the logs with an error message for this. – Russ Cam Dec 10 '16 at 01:46
  • I was just going to write that lowering the replica shards to 1 solved the issue. Do you also have an idea about the `"Root mapping definition has unsupported parameters: [analyzer : my_analyzer]"` error? – Kralizek Dec 10 '16 at 01:51
  • I added the definition here: pastebin.com/VjY5nM0P – Kralizek Dec 10 '16 at 01:53
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/130266/discussion-between-kralizek-and-russ-cam). – Kralizek Dec 10 '16 at 01:54
  • This is now a different question! It would have been best to close/delete this question and add a new one :) – Russ Cam Dec 10 '16 at 01:54
  • Indeed. Add your solution to this post so I can mark it as "accepted answer" and give your help the proper visibility while I prepare the other post. – Kralizek Dec 10 '16 at 01:56

1 Answer


The problem is that you have 1 node and an index with 1 primary shard and 5 replica shards.

Since a replica will not be assigned on the same node as its primary, the 5 replicas will all be unassigned. This is a problem when indexing a document: by default, the write consistency for an index operation is quorum, and a quorum of 6 total shard copies (1 primary + 5 replicas) is 4 (n/2 + 1). This means the document needs to be written to the primary and 3 replicas of the same shard in order to succeed. With unassigned replicas, this can never be satisfied, and you'll see an UnavailableShardsException in the logs with an error message to that effect.
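As a side note, Elasticsearch 2.x lets you relax the write consistency per request with the consistency URL parameter, so a quick way to confirm the diagnosis (not a fix) would be to retry the same indexing call with it:

POST /emails/email/1?consistency=one
{
    "from": "email-address-1",
    "to": "email-address-2",
    "subject": "Hello world",
    "data": {
        "status": "SENT"
    }
}

With consistency=one, only the primary needs to acknowledge the write.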

Changing your index to 5 shards and 1 replica will solve the problem.
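Since the number of replicas (unlike the number of primary shards) can be changed on a live index, the quickest way to apply the fix without touching your mappings is an update to the index settings:

PUT /emails/_settings
{
    "index": {
        "number_of_replicas": 1
    }
}

With a single replica, Elasticsearch only requires the primary copy to acknowledge the write, so indexing succeeds on a single node; the replica will be assigned once a second node joins the cluster.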

Russ Cam