I am trying to fetch and process all documents in an index using the elasticsearch Python client. There are approximately 60M records, and the issue I have is that when I increase the size above 1M the query starts returning nothing.
from elasticsearch import Elasticsearch

es = Elasticsearch("1.1.1.1:1234")

res = es.search(body={
    "from": 0,
    "size": 10000,
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "query": "_exists_:my_string",
                        "fields": []
                    }
                }
            ],
            "filter": [
                {
                    "bool": {
                        "must": [
                            {
                                "range": {
                                    "timestamp": {
                                        "from": "2019-11-01 01:45:00.000",
                                        "to": "2019-11-05 07:45:00.300"
                                    }
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
})
print("%d documents found" % res['hits']['total'])
I want to convert the results (basically JSON) into a pandas DataFrame. This part works well, but I am struggling with how to either fetch all records at once or fetch them in iterations.
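For context, the kind of iteration I have in mind is sketched below. This is only a sketch: iterate_hits, search_fn, and hits_to_df are my own placeholder names, and it assumes search_after-style pagination, where the request body carries a sort on a tie-breaking field and each page resumes from the last hit's sort values (the elasticsearch.helpers.scan helper, which drives the scroll API, may be a readier-made alternative).

```python
import pandas as pd

def iterate_hits(search_fn, body, page_size=10_000):
    # search_fn is any callable that takes a request body and returns the raw
    # response dict, e.g. lambda b: es.search(body=b).
    # body must include a "sort" on a unique (tie-breaking) field so that
    # search_after can resume deterministically; "from" must not be set.
    body = dict(body, size=page_size)
    while True:
        resp = search_fn(body)
        hits = resp["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Resume after the last hit's sort values on the next page.
        body = dict(body, search_after=hits[-1]["sort"])

def hits_to_df(hits):
    # Flatten the per-hit "_source" documents into a pandas DataFrame.
    return pd.DataFrame(hit["_source"] for hit in hits)
```

With a live client this would be used roughly as hits_to_df(iterate_hits(lambda b: es.search(body=b), {**query_body, "sort": ["timestamp"]})), where query_body is the bool query above without "from"/"size". Is this the right pattern, or is there a better way?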