8

I wanted to set the request time to 20 sec or more in Elasticsearch Bulk uploads. Default time is set to 10 sec and my Warning message days it takes 10.006 sec. And, right after displaying the waring the execution is throwing an error

Now, I wanted to set the Request Timeout either for every request taking input from user or any value set by default.

Error Message:

    WARNING:elasticsearch:HEAD /opportunityci/predictionsci [status:404 request:0.080s]
validated the index and mapping...!
WARNING:elasticsearch:POST http://192.168.204.154:9200/_bulk [status:N/A request:10.003s]
Traceback (most recent call last):
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 94, in perform_request
    response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 640, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/util/retry.py", line 238, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 595, in urlopen
    chunked=chunked)
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 395, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 315, in _raise_timeout
    raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
ReadTimeoutError: HTTPConnectionPool(host='192.168.204.154', port='9200'): Read timed out. (read timeout=10)
ERROR:DataScience:init exception : Traceback (most recent call last):
  File "/Users/adaggula/Documents/workspace/LatestDemo/demo/com/ci/dataScience/engine/Driver.py", line 194, in <module>
    sample.persist(finalResults)
  File "/Users/adaggula/Documents/workspace/LatestDemo/demo/com/ci/dataScience/ES/sample.py", line 68, in persist
    res = helpers.bulk(client,data,stats_only=True)
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 89, in _process_bulk_chunk
    raise e
ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='192.168.204.154', port='9200'): Read timed out. (read timeout=10))
Jack Daniel
  • 2,527
  • 3
  • 31
  • 52

1 Answers1

15

Use parameter 'request_timeout'

E.g.:

bulk(es, records, chunk_size=500, request_timeout=20)
Anthon
  • 69,918
  • 32
  • 186
  • 246
Lyncean Patel
  • 2,433
  • 16
  • 15
  • The bulk method does not use request_timeout. Only scan does. – MarkM Jan 14 '19 at 08:42
  • 2
    @MarkM It worked for me and I saw the source code, it has request_timeout parameter. I posted this answer in 2017, might be libraries has been updated and now they have some other option. – Lyncean Patel Jan 18 '19 at 12:35
  • 1
    Apologies, I was wrong. Through `**kwargs` the `request_timeout` is passed along a couple of times, until [client.utils.query_params](https://github.com/elastic/elasticsearch-py/blob/6.3.1/elasticsearch/client/utils.py#L73) consumes and adds it to `params`. – MarkM Jan 19 '19 at 15:46
  • `chunk_size` is not a valid argument for the current version of ES. – Mats May 11 '20 at 11:33