
I am new to Elasticsearch and need to optimize a Python client that runs searches and indexing against an Elasticsearch cluster. The bottleneck appears to be the client itself; the cluster seems able to handle more queries. How can I improve my program's performance? Should I use multiprocessing or multithreading, or is there a more elegant way to do this? Thank you.

ZianyD

1 Answer


If your ES server can comfortably handle multiple requests at once, you can use a ThreadPoolExecutor to run several queries concurrently.

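Because the work is mainly I/O-bound (the client spends most of its time waiting on network responses), threads should be enough; there is no need for multiple processes.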
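As a rough illustration, here is a minimal sketch of that approach. The names `run_queries` and `fake_search` are made up for this example, and `fake_search` only simulates network latency so the snippet runs without a cluster. With a real client you would pass your own callable that wraps the search call; I am assuming here that the client object you use is safe to share between threads, so check that for your client and version.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_queries(search, queries, max_workers=8):
    """Call search(query) for each query on a thread pool.

    Results are returned in the same order as the input queries.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises any exception
        # from a worker when its result is consumed.
        return list(pool.map(search, queries))


def fake_search(query):
    # Hypothetical stand-in for a real Elasticsearch call (assumption):
    # sleeps to imitate waiting on the network, then returns an empty result.
    time.sleep(0.05)
    return {"query": query, "hits": []}


if __name__ == "__main__":
    queries = [{"match": {"title": f"term {i}"}} for i in range(16)]

    start = time.perf_counter()
    results = run_queries(fake_search, queries, max_workers=8)
    elapsed = time.perf_counter() - start

    # Run one after another, these 16 calls would sleep for about 0.8 s
    # in total; with 8 threads the waits overlap.
    print(f"{len(results)} results in {elapsed:.2f}s")
```

The value of `max_workers` is a tuning knob: raise it until the cluster, not the client, becomes the limit, and measure rather than guess.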

noxdafox
  • Apache Spark has a Python interface to ES. There are tutorials on it at http://blog.qbox.io/building-an-elasticsearch-index-with-python and http://blog.qbox.io/elasticsearch-in-apache-spark-python. –  Aug 02 '15 at 13:06