We have an Elasticsearch instance running on Linux in the Azure cloud. We want to programmatically produce a flat file or dump (the format is negotiable) of one of our Elasticsearch indexes once every 24 hours at a specified time. The file would then be delivered to a customer who does not have Elasticsearch. It would be about 15 GB in size and contain approximately 7 million documents.
We think we need to start with a query against our Elasticsearch instance that would actually retrieve the data. However, after going through the documentation, I don't see a query that accomplishes this.
Is anyone aware of such a query, or of another method to achieve this? Beyond the query itself, the large size of the file is a concern and needs to be taken into account in any solution.
EDIT: I've added some relevant information that was not obvious in the original post and that may change the answers slightly.