I am currently working on something where I connect to an Elasticsearch cluster (server/database, whatever the technical term is), and my goal is to grab all the logs from the last 24 hours for parsing. I can grab logs right now, but it only returns a max of 10,000. For reference, within the last 24 hours there have been about 10 million logs total in the index I am using.
On the Python side, I make an HTTP request to Elasticsearch using the requests library. My current query only has the parameter size=10000.
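Roughly, my current code looks like this (the endpoint, index name, and timestamp field are placeholders, not my real ones):

```python
# Minimal sketch of what I'm doing now. Endpoint/index/field names are
# placeholders for illustration; this tops out at 10,000 hits per request.
ES_URL = "http://localhost:9200/my-logs/_search"

# Filter to the last 24 hours, capped at size=10000.
query = {
    "size": 10000,
    "query": {
        "range": {
            "@timestamp": {"gte": "now-24h", "lte": "now"}
        }
    },
}

def fetch_logs():
    # requests is imported here so the rest of the sketch can be read/tested
    # without the library installed.
    import requests

    resp = requests.post(ES_URL, json=query)
    resp.raise_for_status()
    # Only the first 10,000 matching docs come back, no matter how many match.
    return resp.json()["hits"]["hits"]
```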
I am wondering what method/query to use for this case. I have seen things about a scroll ID and the point-in-time (PIT) API, but I am not sure which is best for my case since there are so many logs.
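From what I've read in the Elasticsearch docs, the PIT + search_after flow builds request bodies roughly like this, feeding each page's last "sort" values into the next request. This is just my understanding so far, with assumed field names, and I haven't verified it against my cluster:

```python
# Sketch of one page's search body for PIT + search_after pagination,
# based on my reading of the Elasticsearch docs. Field names ("@timestamp")
# and the keep_alive value are assumptions for illustration.

def build_page_body(pit_id, search_after=None, page_size=10000):
    """Build the body for one page; pass the previous page's last hit's
    'sort' array back in as search_after to get the next page."""
    body = {
        "size": page_size,
        "query": {"range": {"@timestamp": {"gte": "now-24h", "lte": "now"}}},
        # The PIT id pins all pages to a consistent snapshot of the index.
        "pit": {"id": pit_id, "keep_alive": "1m"},
        # A tiebreaker sort (_shard_doc) lets search_after resume deterministically.
        "sort": [{"@timestamp": "asc"}, {"_shard_doc": "asc"}],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body
```

The loop would then be: open a PIT (POST to the index's `_pit` endpoint), repeatedly POST this body to `_search`, taking `hits[-1]["sort"]` from each response as the next `search_after`, until a page comes back empty. Is that the right approach at this scale, or is scroll better?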
I have already tried just increasing the size to a lot more, but that does not work well since there are so many logs, and it errors out.