I have documents that contain an object whose attributes can be edited (added, deleted, or modified) at runtime.

{
  "testIndex" : {
    "mappings" : {
      "documentTest" : {
        "properties" : {
          "typeTestId" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "createdDate" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "designation" : {
            "type" : "string",
            "fields" : {
              "raw" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          },
          "id" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "modifiedDate" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "stuff" : {
            "type" : "string"
          },
          "suggest" : {
            "type" : "completion",
            "analyzer" : "simple",
            "payloads" : true,
            "preserve_separators" : true,
            "preserve_position_increments" : true,
            "max_input_length" : 50,
            "context" : {
              "typeTestId" : {
                "type" : "category",
                "path" : "typeTestId",
                "default" : [ ]
              }
            }
          },
          "values" : {
            "properties" : {
              "Att1" : {
                "type" : "string"
              },
              "att2" : {
                "type" : "string"
              },
              "att400" : {
                "type" : "date",
                "format" : "dateOptionalTime"
              }
            }
          }
        }
      }
    }
  }
}

The field values is an object that can be edited through typeTest, so if I change something in typeTest it should be reflected here. If I create a new field there is no problem, but it should also be possible to edit or delete existing fields in typeTest. For example, if I delete values.att1, every documentTest should lose that field, and the mapping should be updated as well.

From what I have seen, we cannot do this without reindexing. So for now my solution is to remove the fields in Elasticsearch, as mentioned in this question, and have a worker do the reindexing from time to time if needed.

This does not seem like a real solution to me. Is there a better way to model documents of this type in Elasticsearch, with this flexibility, without having to reindex from time to time?

dege

1 Answer

You can use the Update API to delete, add, or modify a field.
The catch is that documents are immutable in Elasticsearch, so when you change one through the Update API, the old document is marked as deleted and a new one containing the changes is indexed.
The deletion and re-creation are transparent to you, so you do not have to reindex or do anything else yourself. The downside is that if you plan to modify a very large number of documents (for example, an update query touching 5 million documents), it will be very I/O intensive for the nodes.
By the way, this also applies to deletions.
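
As a rough illustration of the kind of request this means, here is a minimal Python sketch that only builds the body for an Update API call removing one sub-field of values. The helper name build_remove_subfield_body is made up for this example, and the script assumes a recent Elasticsearch version with Painless; on the older versions that match the mapping in the question, the script language and the key holding the script text are different, so treat this as a sketch rather than something to paste in.

```python
import json


def build_remove_subfield_body(parent: str, subfield: str) -> dict:
    """Build an Update API body that removes parent.subfield from _source.

    Hypothetical helper: assumes Painless scripting is available.
    """
    return {
        "script": {
            "lang": "painless",
            # Skip documents that do not have the parent object at all.
            "source": (
                "if (ctx._source.containsKey(params.parent)) { "
                "ctx._source[params.parent].remove(params.subfield) }"
            ),
            # Passing names as params keeps the script text constant.
            "params": {"parent": parent, "subfield": subfield},
        }
    }


if __name__ == "__main__":
    # Body to send to the Update API for a single document.
    print(json.dumps(build_remove_subfield_body("values", "att2"), indent=2))
```

Note that this only changes the document source. It does not remove the field from the mapping.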

Hkntn
  • My main concern is the garbage that may be left in the mapping of documentTest. With the way I update documentTest, I will only need to do a major update for deletions. I'm planning to use a scheduler, for example updating 10,000 documents each hour until it's done; since I'm only reading the fields that exist in the typeTest document, that will be no problem. But these additions and deletions may leave the documentTest mapping with a lot of unused fields :/ – dege Feb 21 '16 at 09:19
  • I did not fully follow you, but I think you should not schedule your updates, because updates that are spread out are better than a single 10K-document update. The merge thread pool will handle the rest (actually deleting the docs) when merging Lucene segments. – Hkntn Feb 21 '16 at 16:21
  • When I said deletions I was referring to a "sub-field" of the field values (as I mentioned in my question). To help spread the updates (which will delete the "sub-field" from the documents), I was planning to use search and scroll like here: https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/api-reference-2-2.html#api-scroll-2-2. If I understood what you said, I don't need to schedule the scroll myself; the Elasticsearch thread pool will do it for me. Thanks for the help – dege Feb 22 '16 at 10:03
  • Whether you delete a field or edit a field, the document changes, and documents in Elasticsearch are immutable, so any modification results in the creation of a new document and the deletion of the old one. That is the deletion I was referring to. – Hkntn Feb 22 '16 at 11:20
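
For the batching idea discussed in these comments, here is a hedged Python sketch that splits a list of document ids (for example, ids collected from a scroll) into batches and builds a Bulk API payload of update actions for each batch. The names chunked and bulk_remove_payload and the index name used below are invented for this example, the script assumes Painless on a recent Elasticsearch version, and older versions also expect a _type entry in each action line.

```python
import json
from typing import Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive batches of at most `size` ids."""
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_remove_payload(index: str, ids: List[str], subfield: str) -> str:
    """Build a newline-delimited Bulk API body of scripted updates.

    Hypothetical helper: each update removes values.<subfield>.
    """
    script = {
        "script": {
            "lang": "painless",
            "source": (
                "if (ctx._source.containsKey('values')) { "
                "ctx._source['values'].remove(params.subfield) }"
            ),
            "params": {"subfield": subfield},
        }
    }
    lines = []
    for doc_id in ids:
        # Action line first, then the update body for that document.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(script))
    # The Bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for batch in chunked(["1", "2", "3", "4", "5"], 2):
        print(bulk_remove_payload("testindex", batch, "att2"))
```

Each payload would then be sent to the bulk endpoint one batch at a time, which keeps any single request small.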