25

When doing a search, Elasticsearch returns a data structure that contains various meta information.

The actual result set is contained within a "hits" field within the JSON result returned from the database.

Is it possible for Elasticsearch to return only the needed data (the contents of the "hits" field) without embedding it within all the other metadata?

I know I could parse the result into JSON and extract it, but I don't want the complexity, hassle, or performance hit.

Thanks!

Here is an example of the data structure that Elasticsearch returns.

{
    "_shards":{
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits":{
        "total" : 1,
        "hits" : [
            {
                "_index" : "twitter",
                "_type" : "tweet",
                "_id" : "1", 
                "_source" : {
                    "user" : "kimchy",
                    "postDate" : "2009-11-15T14:12:12",
                    "message" : "trying out Elastic Search"
                }
            }
        ]
    }
}
Duke Dougal
  • I believe that being able to control what ES returns is an important feature. For example, if one wants to incorporate results returned from ES into a reproducible document. – Dror Sep 10 '14 at 11:40
  • 1
    Possible duplicate of [Filter out metadata fields and only return source fields in elasticsearch](http://stackoverflow.com/questions/23283033/filter-out-metadata-fields-and-only-return-source-fields-in-elasticsearch) – The Demz Mar 01 '16 at 18:39
  • duplicate: https://stackoverflow.com/questions/43772834/need-to-return-source-fields-only-without-any-metadata-how-to-use-plugin?noredirect=1&lq=1 – Sandeep Kanabar Jan 03 '18 at 17:10

2 Answers

30

You can at least filter the results, even if you cannot extract them. The "common options" page of the REST API explains the "filter_path" option. This lets you filter only the portions of the tree you are interested in. The tree structure is still the same, but without the extra metadata.

I generally add the query option:

&filter_path=hits.hits.*,aggregations.*

The documentation doesn't say anything about this making your query any faster (I doubt that it does), but at least you could return only the interesting parts.
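As a rough illustration of the above, here is a minimal Python sketch that builds a search URL with filter_path and then pulls the documents out of the trimmed response. The host (http://localhost:9200), the index name, the query string, and the helper names build_search_url and extract_sources are all assumptions made for the example, and the sample response is hand-written to mirror the structure in the question rather than captured from a real cluster. Narrowing the filter to hits.hits._source is my own choice here; hits.hits.* as shown above works the same way but keeps _index, _type and _id.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, index, query, filter_path="hits.hits._source"):
    """Build a URI-search URL that asks Elasticsearch to trim the response.

    base_url, index and query are placeholders supplied by the caller.
    """
    params = urlencode({"q": query, "filter_path": filter_path})
    return f"{base_url}/{index}/_search?{params}"


def extract_sources(response_text):
    """Return the list of _source documents from a (filtered) search response."""
    body = json.loads(response_text)
    # Be defensive: the filtered body may not contain "hits" at all
    # (for example when nothing matched), so fall back to an empty list.
    return [hit["_source"] for hit in body.get("hits", {}).get("hits", [])]


# Hand-written sample of what a filtered response could look like
# (an assumption based on the structure shown in the question).
sample = (
    '{"hits":{"hits":[{"_source":'
    '{"user":"kimchy","message":"trying out Elastic Search"}}]}}'
)

url = build_search_url("http://localhost:9200", "twitter", "user:kimchy")
sources = extract_sources(sample)
print(url)
print(sources)
```

The nesting is still hits.hits[]._source, so a small amount of client-side unwrapping remains; filter_path only reduces what is sent over the wire.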

  • Corrected to show only hits.hits.*, since the top level "hits" has metadata as well.
Michael Erickson
  • Just to add more information: filter_path (response filtering) doesn't work in version 1.5 of Elasticsearch. Unless it was moved in the documentation or renamed, it was first added in version 1.6: https://www.elastic.co/guide/en/elasticsearch/reference/1.6/common-options.html#_response_filtering – Jorge May 24 '18 at 10:08
12

No, it's not possible at this moment. If the performance and complexity of parsing are the main concerns, you might want to consider using a different client: the Java client or the Thrift plugin, for example.

imotov
  • 3
    "No, it's not possible at this moment"... now that the moment has changed, should/can the answer be updated? :) – Dror Jul 11 '14 at 11:14
  • I would need this feature too, e.g. I have stored an HTML page in one of my Elasticsearch documents and I want to retrieve the *content* without any metadata – yancheelo Jun 15 '22 at 14:17