I am new to Elasticsearch and need to optimize a Python client that runs searches and indexing against an Elasticsearch cluster. It seems to me that the bottleneck is the client itself and that Elasticsearch could handle more queries. How can I make my program faster? Should I use multiprocessing or multithreading, or is there a more elegant way to do this? Thank you.
1 Answer
If your ES server can easily handle multiple requests, you can use a ThreadPoolExecutor to run multiple queries concurrently.
As the operation is mainly I/O-bound, using threads should be enough.
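
A minimal sketch of the thread pool approach follows. It is an illustration under assumptions, not a drop-in client: `run_query` is a hypothetical stand-in that only sleeps to simulate network latency, and `run_queries` and `max_workers=8` are names and values chosen for the example. In real code, `run_query` would issue the blocking search call through whatever Elasticsearch client you use.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(query):
    """Hypothetical stand-in for a blocking Elasticsearch request.

    Replace the body with the real search call from your client.
    """
    time.sleep(0.1)  # simulate network latency (assumption for the demo)
    return {"query": query, "hits": []}


def run_queries(queries, max_workers=8):
    # pool.map() runs the calls concurrently in worker threads and
    # returns the results in the same order as the input queries.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))


if __name__ == "__main__":
    queries = [{"match": {"title": f"term {i}"}} for i in range(16)]
    start = time.perf_counter()
    results = run_queries(queries)
    elapsed = time.perf_counter() - start
    print(f"{len(results)} results in {elapsed:.2f}s")
```

Because each thread spends most of its time waiting on the network, the calls overlap and the total time should be well below what running the queries one after another would take. The right `max_workers` value depends on your cluster and is best found by measuring.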

noxdafox
Apache Spark has a Python interface to ES. There is a tutorial on it at http://blog.qbox.io/building-an-elasticsearch-index-with-python and http://blog.qbox.io/elasticsearch-in-apache-spark-python. – Aug 02 '15 at 13:06