
For the life of me, I can't find any reference to using the Elasticsearch scroll API from within Ruby on Rails with the elasticsearch-model (or elasticsearch-rails or elasticsearch-dsl) gem.

The only thing the docs reference is calling scroll directly on the client, which kind of defeats the purpose of the model integration. Also, that approach does not use the client or any client settings you've already configured in your Rails app.

I want to do something like this.

Here is the Elasticsearch query that works from within the Kibana Dev Tools:

GET model_index/_search?scroll=1m
{
  "size": 100,
  "query": {
    "match": {
      "tenant_id": 3196
    }
  },
  "_source": "id"
}

I would have thought that I could call something like

MyModel.search scroll: '1m', ...

but instead it seems like I need to do:

# First create a client by hand
client = Elasticsearch::Client.new
result = client.search(
  index: 'model_index',
  scroll: '1m',
  body: { query: { match: { tenant_id: 3196 } }, sort: '_id' }
)

Does anyone have any more user-friendly examples?

phil

1 Answer

As per the Elasticsearch guide:

We no longer recommend using the scroll API for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

Ref - https://www.elastic.co/guide/en/elasticsearch/reference/7.x/scroll-api.html
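For reference, the search_after plus PIT approach from the quote could look roughly like the sketch below. It is not taken from the guide: each_hit_with_pit is a hypothetical helper name, and the code assumes your client version exposes open_point_in_time and close_point_in_time at the top level and that the cluster accepts the _shard_doc sort (both depend on your Elasticsearch and gem versions, so check them first).

```ruby
# Sketch only: `client` is any object that responds to open_point_in_time,
# search and close_point_in_time the way the official Ruby client does.
# each_hit_with_pit is a hypothetical helper, not part of any gem.
def each_hit_with_pit(client, index:, query:, size: 1000, keep_alive: '1m')
  unless block_given?
    return enum_for(__method__, client, index: index, query: query,
                                        size: size, keep_alive: keep_alive)
  end

  # Open a point in time so every page sees the same index state
  pit_id = client.open_point_in_time(index: index, keep_alive: keep_alive)['id']
  search_after = nil

  begin
    loop do
      body = {
        size: size,
        query: query,
        pit: { id: pit_id, keep_alive: keep_alive },
        # Assumption: _shard_doc is available as a cheap, stable sort key
        sort: [{ '_shard_doc' => 'asc' }]
      }
      body[:search_after] = search_after if search_after

      # No index here: the PIT already identifies what to search
      response = client.search(body: body)
      pit_id = response['pit_id'] || pit_id
      hits = response.dig('hits', 'hits') || []
      break if hits.empty?

      hits.each { |hit| yield hit }

      # The sort values of the last hit mark where the next page starts
      search_after = hits.last['sort']
    end
  ensure
    # Release the PIT even if the caller stops early or an error is raised
    client.close_point_in_time(body: { id: pit_id })
  end
end
```

Calling it with a block, for example each_hit_with_pit(client, index: 'model_index', query: { match: { tenant_id: 3196 } }) { |hit| ... }, walks every hit page by page; without a block it returns an Enumerator.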

As for the question itself: to scroll through the documents, you need to pass the scroll_id from each response to the scroll call in order to get the next batch of results.

# Build the client once and reuse it for every request
client = Elasticsearch::Client.new
body = { query: { match: { tenant_id: 3196 } }, sort: '_id' }

# The initial search opens the scroll context and returns the first batch
response = client.search(
  index: 'model_index',
  scroll: '1m',
  body: body,
  size: 3000
)

loop do
  hits = response.dig('hits', 'hits')
  break if hits.empty?

  hits.each do |hit|
    # do something
  end

  # Fetch the next batch using the scroll_id from the previous response
  response = client.scroll(
    body: { scroll_id: response['_scroll_id'] },
    scroll: '1m'
  )
end
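To get closer to the "user-friendly" call the question asks for, the same scroll loop can be wrapped in a small helper that takes the client as an argument, so the client already configured in the Rails app can be passed in. This is only a sketch: scroll_each_hit is a hypothetical name, not something the gems provide, and it assumes the client responds to search, scroll and clear_scroll the way Elasticsearch::Client does.

```ruby
# Sketch only: scroll_each_hit is a hypothetical helper, not a gem API.
# `client` is expected to behave like Elasticsearch::Client.
def scroll_each_hit(client, index:, body:, size: 1000, scroll: '1m')
  unless block_given?
    return enum_for(__method__, client, index: index, body: body,
                                        size: size, scroll: scroll)
  end

  # The initial search opens the scroll context and returns the first batch
  response = client.search(index: index, scroll: scroll, size: size, body: body)
  scroll_id = response['_scroll_id']

  begin
    loop do
      hits = response.dig('hits', 'hits') || []
      break if hits.empty?

      hits.each { |hit| yield hit }

      # Ask for the next batch and keep the most recent scroll_id
      response = client.scroll(body: { scroll_id: scroll_id }, scroll: scroll)
      scroll_id = response['_scroll_id'] || scroll_id
    end
  ensure
    # Free the scroll context instead of waiting for it to time out
    client.clear_scroll(body: { scroll_id: scroll_id }) if scroll_id
  end
end
```

In a Rails app using elasticsearch-model, the call might then look like scroll_each_hit(MyModel.__elasticsearch__.client, index: MyModel.index_name, body: { query: { match: { tenant_id: 3196 } } }) { |hit| ... }, assuming MyModel includes Elasticsearch::Model. Passing the model's client in means the host, logging and other settings from your initializer are reused rather than rebuilt by hand.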
Sandip Karanjekar