var settings = new ConnectionSettings(Constants.ElasticSearch.Node);
var client = new ElasticClient(settings);

var response = client.Search<DtoTypes.Customer.SearchResult>(s =>
    s.From(0)
    .Size(100000)
    .Query(q => q.MatchAll()));

It works when the size is smaller, but I want to retrieve all documents in an index that has over 100k documents. There must be a configuration setting I'm missing to get around a limit. I've also tried Take() instead of Size().

The debug information returned is:

"Invalid NEST response built from a unsuccesful low level call on POST: /_search\r\n# Audit trail of this API call:\r\n - BadResponse: Node: http://127.0.0.1:9200/ Took: 00:00:00.2964038\r\n# ServerError: ServerError: 500Type: search_phase_execution_exception Reason: \"all shards failed\"\r\n# OriginalException: System.Net.WebException: The remote server returned an error: (500) Internal Server Error.\r\n at System.Net.HttpWebRequest.GetResponse()\r\n at Elasticsearch.Net.HttpConnection.Request[TReturn](RequestData requestData) in C:\users\russ\source\elasticsearch-net\src\Elasticsearch.Net\Connection\HttpConnection.cs:line 138\r\n# Request:\r\n\r\n# Response:\r\n\r\n"

Chris Klepeis
  • The debug info makes it seem like something is going on within elasticsearch during the query, not so much within nest. Did you try just running it through the normal search API? You should be able to just grab the query from the nest response as well from `response.RequestInformation` – BenM Mar 30 '16 at 14:18
  • Maybe [this](http://stackoverflow.com/questions/27955623/is-there-a-way-to-retrieve-all-records-in-a-elasticsearch-nest-query) answer will help you. – Rob Mar 30 '16 at 14:22
  • If you `.DisableDirectStreaming()` on `ConnectionSettings`, you'd be able to see the Request and Response in `DebugInformation` too (_you'll probably only want to use `.DisableDirectStreaming()` whilst in development and not in production_) – Russ Cam Mar 30 '16 at 20:36

1 Answer


Elasticsearch has a soft limit on the number of results it will return from a single request. If you want more than 10,000 results in one go, you should use the scan and scroll functionality :)
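A minimal sketch of what scrolling could look like with NEST, assuming a NEST 2.x or later client. The helper name `FetchAll`, the page size of 1000, and the "2m" scroll timeout are illustrative choices, not values from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Nest;

public static class ScrollSketch
{
    // Hypothetical helper: pages through every document matched by a
    // match_all query using the scroll API instead of one huge Size().
    public static List<T> FetchAll<T>(IElasticClient client,
                                      string scrollTimeout = "2m",
                                      int pageSize = 1000)
        where T : class
    {
        var results = new List<T>();

        // The first request opens the scroll context and returns page one.
        var response = client.Search<T>(s => s
            .Size(pageSize)
            .Scroll(scrollTimeout)
            .Query(q => q.MatchAll()));

        // Keep asking for the next page until a page comes back empty.
        while (response.IsValid && response.Documents.Any())
        {
            results.AddRange(response.Documents);
            response = client.Scroll<T>(scrollTimeout, response.ScrollId);
        }

        // Surface server-side failures instead of returning partial data.
        if (!response.IsValid)
            throw new InvalidOperationException(response.DebugInformation);

        return results;
    }
}
```

With the client from the question this would be called as `ScrollSketch.FetchAll<DtoTypes.Customer.SearchResult>(client)`. Each page stays well under the 10,000 limit, and the scroll context expires on its own once the timeout passes without a further request.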

From the Elasticsearch documentation:

"Note that from + size can not be more than the index.max_result_window index setting which defaults to 10,000. See the Scroll API for more efficient ways to do deep scrolling."
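The setting named in that quote is a per-index setting. As a rough sketch, assuming Elasticsearch 2.1 or later, a body like the following sent with PUT to `/<index>/_settings` would raise the window for that index. The value 200000 is an arbitrary illustration, and a larger window means more memory used per deep request:

```json
{
  "index": {
    "max_result_window": 200000
  }
}
```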

Reference:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html

https://nest.azurewebsites.net/nest/search/scroll.html

Byron Voorbach
  • I'll try that out. In my old tests using Nest I set Take(100000) and it worked, so I must have changed that setting to increase the value. From your link above, I also see "Scrolling is not intended for real time user requests, but rather for processing large amounts of data". I may end up keeping an in-memory cache of the entirety of my data to pull it all, then just use elastic for actual searching – Chris Klepeis Mar 30 '16 at 14:39
  • Were your old tests running against an older version of Elasticsearch? They introduced the limit somewhere in 2.x if I remember correctly. I'm not sure what your use-case is, but do you need 100,000 results to be returned at once? – Byron Voorbach Mar 30 '16 at 14:41
  • I thought it was 2.x. I probably set index.max_result_window in the config file to something ridiculous. Yeah... what the owner of the company wants, the owner of the company gets :) I've already made my case against it. – Chris Klepeis Mar 30 '16 at 14:44
  • Clients.. ;) Good luck with your project! – Byron Voorbach Mar 30 '16 at 14:51