
I am ingesting CSV data into Elasticsearch using the append processor. I already have two fields that are objects (object1 and object2), and I want to append them both into an array under a different field (mainlist), so it would come out as mainlist: [ {object1}, {object2} ]. I have tried the set processor with the copy_from parameter, but I am getting an error that I am missing the required property "value", even though the Elasticsearch documentation clearly doesn't use the "value" property when it uses "copy_from":

{"set": {"field": "mainlist", "copy_from": ["object1", "object2"]}}

My syntax is even copied exactly from the documentation. Please help.

Furthermore, I need to drop empty fields at the ingest level so they are not returned; I don't want "fieldname": "" returned to the user. What is the best way to do that? I am new to Elasticsearch and it has not been going well.

Dimeji Olayinka

1 Answer

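Regarding the copy_from error: the set processor's copy_from expects a single source field name, not an array, and if your cluster version predates copy_from support, the processor insists on "value", which would explain the error you are seeing. One way to build the array is a script processor instead (a minimal sketch, assuming object1 and object2 are present on every document):

{
  "script": {
    "source": """
      // combine the two object fields into a single array field
      ctx.mainlist = [ctx.object1, ctx.object2];
    """
  }
}

If you don't want to keep the originals, ctx.remove('object1') returns the removed value, so you can build the list and drop the source fields in one step.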

As to dropping the empty fields at the ingest level -- set up a pipeline:

PUT _ingest/pipeline/no_empty_fields
{
  "description": "Removes empty-ish fields from a doc",
  "processors": [
    {
      "script": {
        "source": """
          // collect the names of all top-level fields whose value is null or an empty string
          def keys_to_remove = ctx.keySet()
                          .stream()
                          .filter(field -> ctx[field] == null ||
                                           ctx[field] == "")
                          .collect(Collectors.toList());

          // drop those fields from the document before it is indexed
          for (key in keys_to_remove) {
            ctx.remove(key);
          }
        """
      }
    }
  ]
}

and apply it upon indexing:

POST myindex/_doc?pipeline=no_empty_fields
{
  "fieldname23": 123,
  "fieldname": null,
  "fieldname123": ""
}
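
You can dry-run the pipeline without indexing anything via the simulate API, reusing the sample document from above:

POST _ingest/pipeline/no_empty_fields/_simulate
{
  "docs": [
    {
      "_source": {
        "fieldname23": 123,
        "fieldname": null,
        "fieldname123": ""
      }
    }
  ]
}

Only fieldname23 should survive in the resulting _source.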

You can of course extend the conditions to ditch fields holding other junk values such as "undefined", "Infinity" and the like.
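
For instance, the filter could also catch those string sentinels (a sketch extending the script above):

def keys_to_remove = ctx.keySet()
                .stream()
                .filter(field -> ctx[field] == null ||
                                 ctx[field] == "" ||
                                 ctx[field] == "undefined" ||
                                 ctx[field] == "Infinity")
                .collect(Collectors.toList());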

Joe - GMapsBook.com