I'm looking for ways to cut down the response time on elasticsearch percolation and reducing the CPU utilization while, its being performed.
Tried a bunch of steps and was successful in bring down the response time but, it impacted the CPU utilization. I'm using elasticsearch 5.6 and I'm checking whether, I can get a response time of less than 2 seconds at-least.
The steps are mentioned below:
Tried running the percolator query with 1 node and 1 shard. The response time was very poor. It varied between 40 - 37 seconds.
Tried running the percolator query with 1 node and 3 shards. The response time was better but, not great. It varied between 16 - 14 seconds. This was a scenario, where I attempted over-allocation of shards to see, if that made a difference and though, the response time got better, the CPU utilization was over 90% on a 4 core and 32 GB VM. Memory spike was there but, nothing alarming. I think, memory would become a concern, if consecutive percolator queries would have been attempted.
Tried running the percolator query with 1 node and 10 shards. The response time was better but, not great. It varied between 15 - 13 seconds.
Checked out some links on elasticsearch git discussion and tried reducing the terms but, that started affected the scoring and hence, had to abandon this step as, the scoring and matching should be consistent for the use case, I'm trying out.
Links that, I referred to are mentioned below.
How to improve percolator performance in ElasticSearch?