
The goal is to query specific fields from an Elasticsearch index from a Spring Boot app.

The questions are at the end.

The data in Elasticsearch is created by the Elastic Stack (Beats, Logstash, etc.). There is some inconsistency, e.g. some fields may be missing on some hits.

The Spring app does not write the data and has no control over the fields or indexes.

The query I need, using _source:

GET index-2022.07.27/_search
{
  "from": 0,
  "size": 100,
  "_source": ["@timestamp","message", "agent.id"],
  "query": {
      "match_all": {}
  }
}

returns hits like this:

  {
    "_index": "index-2022.07.27",
    "_id": "C1zzPoIBgxar5OgxR-cs",
    "_score": 1,
    "_ignored": [
      "event.original.keyword"
    ],
    "_source": {
      "agent": {
        "id": "ddece977-9fbb-4f63-896c-d3cf5708f846"
      },
      "@timestamp": "2022-07-27T09:18:27.465Z",
      "message": """a message"""
    }
  },

and with fields instead of _source, each hit looks like this:

{
    "_index": "index-2022.07.27",
    "_id": "C1zzPoIBgxar5OgxR-cs",
    "_score": 1,
    "_ignored": [
      "event.original.keyword"
    ],
    "fields": {
      "@timestamp": [
        "2022-07-27T09:18:27.465Z"
      ],
      "agent.id": [
        "ddece977-9fbb-4f63-896c-d3cf5708f846"
      ],
      "message": [
        """a message"""
      ]
    }
},
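The two response shapes differ in how a client has to read them: `_source` keeps the original nesting (`agent` is an object containing `id`), while `fields` returns flat dotted keys whose values are always arrays. A minimal sketch in plain Java, assuming a hit has already been deserialized into `Map`s; the class and method names are hypothetical and not part of any Elasticsearch client:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class HitShapes {

    // Walks a dotted path such as "agent.id" through nested maps (the _source shape).
    public static Optional<Object> fromSource(Map<String, Object> source, String path) {
        Object current = source;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return Optional.empty(); // parent is missing or is not an object
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return Optional.ofNullable(current);
    }

    // Looks up the flat dotted key and takes the first array element (the fields shape).
    public static Optional<Object> fromFields(Map<String, List<Object>> fields, String path) {
        List<Object> values = fields.get(path);
        if (values == null || values.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(values.get(0));
    }

    public static void main(String[] args) {
        Map<String, Object> source = Map.of(
                "agent", Map.of("id", "ddece977-9fbb-4f63-896c-d3cf5708f846"),
                "@timestamp", "2022-07-27T09:18:27.465Z",
                "message", "a message");
        Map<String, List<Object>> fields = Map.of(
                "agent.id", List.<Object>of("ddece977-9fbb-4f63-896c-d3cf5708f846"),
                "@timestamp", List.<Object>of("2022-07-27T09:18:27.465Z"),
                "message", List.<Object>of("a message"));

        // Both shapes yield the same agent id.
        System.out.println(fromSource(source, "agent.id").equals(fromFields(fields, "agent.id")));
        // A hit without an agent object simply yields an empty Optional.
        System.out.println(fromSource(Map.of("message", "no agent"), "agent.id").isPresent());
    }
}
```

Either helper returns an empty `Optional` when `agent.id` is missing, which matches the inconsistency described above.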
  1. How can I run this query with Spring Boot?

I am leaning towards StringQuery with the RestHighLevelClient, as below, but can't get it to work:

        Query searchQuery = new StringQuery("{\"_source\":[\"@timestamp\",\"message\",\"agent.id\"],\"query\":{\"match_all\":{}}}");

        SearchHits<Items> productHits = elasticsearchOperations.search(
                searchQuery,
                Items.class,
                IndexCoordinates.of(CURRENT_INDEX));
  2. What form must Items.class have? What fields?

I just need timestamp, message and agent.id. The latter is optional; it may not exist.

  3. How will the mapping work?

versions:

  • Elastic: 8.3.2
  • Spring boot: 2.6.6
  • elastic (mvn): 7.15.2
  • spring-data-elasticsearch (mvn): 4.3.3

The official documentation states that with the RestHighLevelClient these versions should be supported:

Support for upcoming versions of Elasticsearch is being tracked and general compatibility should be given assuming the usage of the high-level REST client.

thahgr

1 Answer


You can define an entity class for the data you want to read (note I have a nested class for the agent):

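import java.time.Instant;

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.DateFormat;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

// createIndex = false: the index is owned by the Elastic Stack, not by this app
@Document(indexName = "index-so", createIndex = false)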
public class SO {
    @Id
    private String id;

    @Field(name = "@timestamp", type = FieldType.Date, format = DateFormat.date_time)
    private Instant timestamp;

    @Field(type = FieldType.Object)
    private Agent agent;

    @Field(type = FieldType.Text)
    private String message;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public Instant getTimestamp() {
        return timestamp;
    }

    public void setTimestamp(Instant timestamp) {
        this.timestamp = timestamp;
    }

    public Agent getAgent() {
        return agent;
    }

    public void setAgent(Agent agent) {
        this.agent = agent;
    }

    public String getMessage() {
        return message;
    }

    public void setMessage(String message) {
        this.message = message;
    }

    // static, so it can be instantiated without an enclosing SO instance
    static class Agent {
        @Field(name = "id", type = FieldType.Keyword)
        private String id;

        public String getId() {
            return id;
        }

        public void setId(String id) {
            this.id = id;
        }
    }
}

The query then would be:

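import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;

import org.springframework.data.elasticsearch.core.query.FetchSourceFilter;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

// operations is an injected ElasticsearchOperations
var query = new NativeSearchQueryBuilder()
    .withQuery(matchAllQuery())
    // first array: fields to include in _source; second array: fields to exclude
    .withSourceFilter(new FetchSourceFilter(
        new String[]{"@timestamp", "message", "agent.id"},
        new String[]{}))
    .build();
// to target a different index, pass IndexCoordinates.of(...) as a third argument
var searchHits = operations.search(query, SO.class);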
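
Because `agent.id` may be absent on some hits (as the question notes), the `agent` property of a mapped entity would normally be left `null` for those documents. A minimal sketch of a null-safe accessor, using stripped-down stand-ins for the entity (no Spring annotations, so it runs on its own); the `Hit` class and the `agentIdOf` helper are hypothetical names for illustration:

```java
import java.util.Optional;

public class AgentAccess {

    // Stand-in for the nested agent object.
    public static class Agent {
        private final String id;
        public Agent(String id) { this.id = id; }
        public String getId() { return id; }
    }

    // Stand-in for the mapped entity.
    public static class Hit {
        private final String message;
        private final Agent agent; // null when the document has no agent object
        public Hit(String message, Agent agent) {
            this.message = message;
            this.agent = agent;
        }
        public String getMessage() { return message; }
        public Agent getAgent() { return agent; }
    }

    // Returns the agent id only if both the agent object and its id are present.
    public static Optional<String> agentIdOf(Hit hit) {
        return Optional.ofNullable(hit.getAgent()).map(Agent::getId);
    }

    public static void main(String[] args) {
        Hit withAgent = new Hit("a message", new Agent("ddece977-9fbb-4f63-896c-d3cf5708f846"));
        Hit withoutAgent = new Hit("a message", null);
        System.out.println(agentIdOf(withAgent).orElse("(no agent)"));
        System.out.println(agentIdOf(withoutAgent).orElse("(no agent)"));
    }
}
```

The same `Optional.ofNullable(...).map(...)` chain can be applied to the real entity's `getAgent()` when iterating over the search hits.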
P.J.Meisch
  • This is it indeed! I figured out the nested class in the SO, just like configuration properties. The query with the source filter brings back only the required fields, reducing the network message size by 90%! I hadn't found an example of the filtering, which is why I was trying it with the StringQuery. Thanks for taking the time to answer in detail! – thahgr Jul 28 '22 at 11:07
  • I couldn't get the `Instant` to work, as the timestamps have some inconsistencies; I will try to fix that in the `Logstash` instances. For the moment it works fine with `String`. – thahgr Jul 28 '22 at 11:11
  • Could you provide a snippet that only queries the number of hits? I have found this, if you agree: https://stackoverflow.com/questions/72042661/how-to-get-total-hits-for-an-es-query-after-setting-from-and-size-field – thahgr Jul 28 '22 at 13:12
  • Never mind, got it with `Long totalCount = elasticsearchOperations.count(searchQuery, IndexCoordinates.of(elasticIndex));` – thahgr Jul 28 '22 at 13:18
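
On the inconsistent timestamps mentioned in the comments: if the field is kept as a `String` in the entity, one possible workaround is to parse it on the client and tolerate values that do not match. A minimal sketch using only `java.time`; the fallback formats are assumptions, since the thread does not say which variants actually occur in the data, and the class name is hypothetical:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;
import java.util.Optional;

public class LenientTimestamps {

    // Tries an ISO-8601 instant, then an offset date-time, then a local
    // date-time (assumed to be UTC). Returns empty if none of them match.
    public static Optional<Instant> parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        String s = raw.trim();
        try {
            return Optional.of(Instant.parse(s));
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        try {
            return Optional.of(OffsetDateTime.parse(s).toInstant());
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        try {
            return Optional.of(LocalDateTime.parse(s).toInstant(ZoneOffset.UTC));
        } catch (DateTimeParseException ignored) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2022-07-27T09:18:27.465Z").isPresent());
        System.out.println(parse("not a timestamp").isPresent());
    }
}
```

Returning an `Optional` keeps a single malformed timestamp from failing the whole result page; fixing the format at the Logstash level, as suggested in the comment, remains the cleaner long-term option.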