
I am using the Bonsai Elasticsearch add-on in one of my Ruby on Rails projects. So far things had been going very smoothly, but the moment we launched the application to end users and people started coming in, we noticed that a single Elasticsearch query takes 5-7 seconds to respond (a really bad experience for us), even though we have 8 2x web dynos in place.

So we decided to upgrade the Bonsai add-on to Bonsai 10 and also added the New Relic add-on (to keep an eye on how long a single query takes to respond).

Below are our environment settings:

Ruby: 2.2.4
Rails: 4.2.0
elasticsearch: 1.0.15
elasticsearch-model: 0.1.8

So we imported the data into Elasticsearch again, and here's our Elasticsearch cluster health:

pry(main)> Article.__elasticsearch__.client.cluster.health
=> {"cluster_name"=>"elasticsearch",
    "status"=>"green",
    "timed_out"=>false,
    "number_of_nodes"=>3,
    "number_of_data_nodes"=>3,
    "active_primary_shards"=>1,
    "active_shards"=>2,
    "relocating_shards"=>0,
    "initializing_shards"=>0,
    "unassigned_shards"=>0,
    "delayed_unassigned_shards"=>0,
    "number_of_pending_tasks"=>0,
    "number_of_in_flight_fetch"=>0}

and below is New Relic's data on ES calls:

[New Relic screenshot: Elasticsearch call times]

which gives us a big reason to worry.

My model article.rb is below:

class Article < ActiveRecord::Base
  include Elasticsearch::Model

  after_commit on: [:create] do
    begin
      __elasticsearch__.index_document
    rescue Exception => ex
      logger.error "ElasticSearch after_commit error on create: #{ex.message}"
    end
  end

  after_commit on: [:update] do
    begin
      Elasticsearch::Model.client.exists?(index: 'articles', type: 'article', id: self.id) ? __elasticsearch__.update_document : __elasticsearch__.index_document
    rescue Exception => ex
      logger.error "ElasticSearch after_commit error on update: #{ex.message}"
    end
  end

  after_commit on: [:destroy] do
    begin
      __elasticsearch__.delete_document
    rescue Exception => ex
      logger.error "ElasticSearch after_commit error on delete: #{ex.message}"
    end
  end

  def as_indexed_json(options={})
    as_json({
      only: [ :id, :article_number, :user_id, :article_type, :comments, :posts, :replies, :status, :fb_share, :google_share, :author, :contributor_id, :created_at, :updated_at ],
      include: {
        posts: { only: [ :id, :article_id, :post ] },
      }
    })
  end
end

Now, if I look at the Bonsai 10 plan on Heroku, it gives me 20 shards, but with the current cluster status only 1 active primary shard and 2 active shards are in use. A few questions came to mind:

  1. Will increasing the number of shards to 20 help here?
  2. Is it possible to cache the ES queries? Would you suggest that, and what are the pros and cons?

Please help me find ways to reduce the response time and make ES work more efficiently.

UPDATE

Here's a small code snippet https://jsfiddle.net/puneetpandey/wpbohqrh/2/ I created (as a reference) to show exactly why I need so many calls to Elasticsearch.

In the example above, I am showing a few counts (in front of each checkbox element). To show those counts, I need to fetch numbers, which I get by hitting ES.

OK, so after reading the comments and finding a good article here: How to config elasticsearch cluster on one server to get the best performace on search, I think I've got enough to re-structure upon.

Best,

Puneet

Puneet Pandey

2 Answers


Nick with Bonsai here. If you get in touch with our support team at support@bonsai.io, we're always happy to help with performance questions, and we have access to much more detailed logs to help with that. In the meantime, I think I can share some sufficiently generic advice here…

In this case, the interesting statistic on your New Relic report is "Average calls (per txn): 109." If I'm understanding that correctly, it looks like your app is averaging over 100 calls to Elasticsearch per web request. That seems unusually high.

If that 3,000ms is averaged across all 100+ requests, then that's around 30 ms per request to Elasticsearch. That's also a bit slower than our usual average, but much more reasonable than 3,000ms for a single request. (We can share more specific numbers with you via more private support correspondence.)

You may want to focus on decreasing the number of Elasticsearch requests. If you can't reduce the total requests, you might consider combining them to save on per-request and per-connection overhead. Bonsai also supports HTTP keep-alive, so you can reuse connections between requests, helping to reduce the overhead of the initial TLS handshake.

For consolidating updates, you can use the Bulk API. There's also the Multi Search API for searches and Multi Get API for single-document get requests.
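A rough sketch of the Multi Search approach with the elasticsearch-ruby client might look like the following. The index/type names ('articles', 'article') and the 'status' field are assumptions based on the model in the question:

```ruby
# Sketch: combine several per-attribute count queries into a single
# Multi Search request instead of one HTTP round trip per checkbox.
# Msearch bodies alternate header hashes and search-body hashes.
def msearch_body(statuses)
  statuses.flat_map do |status|
    [ { index: 'articles', type: 'article' },                 # header: where to search
      { size: 0, query: { term: { status: status } } } ]      # body: count-only query
  end
end

body = msearch_body(%w[draft published archived])

# With a live cluster (requires the 'elasticsearch' gem):
#   client    = Elasticsearch::Client.new
#   responses = client.msearch(body: body)['responses']
#   counts    = responses.map { |r| r['hits']['total'] }
```

One `msearch` call pays the connection and request overhead once, however many counts you need.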

If neither reduction nor consolidation is possible, then you may have some other use case that makes all of those individual searches important. If that's the case, I would recommend using Ajax in the UI to post-load those searches. That way your app can serve a fast initial response and show some progress to the user while gradually filling in the rest.

Nick Zadrozny
  • Thanks for the quick reply @Nick. I totally agree: with each web request, I am hitting ES 100+ times to get records for each attribute on the page. Reducing the total number of calls certainly helps here, and I am also considering your third option as well, wherein I can use AJAX requests – Puneet Pandey Mar 06 '16 at 14:49
  • If you've ever done work on the "N+1 Query" problem in your database calls, it sounds like you've created a similar situation for yourself here with Elasticsearch :-) – Nick Zadrozny Mar 08 '16 at 01:38
  • Unfortunately @nick-zadrozny, I don't see any option to reduce the calls to ES. Here's a small snippet: [code snippet](https://jsfiddle.net/puneetpandey/wpbohqrh/2/). As you'll see, to display the count for each attribute in the advanced search, I'll have to query Elasticsearch. Let me know if that explains it! – Puneet Pandey Mar 08 '16 at 06:57
  • Hi Puneet, what you're trying to do is exactly what [Aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/1.7/search-aggregations.html) are for – Nick Zadrozny Mar 08 '16 at 17:27
  • Here's another article I'd found very useful- http://stackoverflow.com/questions/23269280/how-to-config-elasticsearch-cluster-on-one-server-to-get-the-best-performace-on – Puneet Pandey Mar 10 '16 at 19:50
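The aggregations approach mentioned in the comments above could look roughly like this; the field names ('article_type', 'status') are assumptions taken from as_indexed_json in the question:

```ruby
# Sketch: one terms aggregation per checkbox field, so a single request
# returns the counts for every checkbox value at once.
def facet_counts_query(fields)
  {
    size: 0,  # we only want the counts, not the matching documents
    aggs: fields.each_with_object({}) do |field, aggs|
      aggs[field] = { terms: { field: field } }
    end
  }
end

query = facet_counts_query(%w[article_type status])

# With a live cluster:
#   response = Article.__elasticsearch__.client.search(index: 'articles', body: query)
#   response['aggregations']['status']['buckets']
#   # each bucket is a hash like { 'key' => ..., 'doc_count' => ... }
```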

You have 3 ES nodes, and optimal performance requires at least one shard per node; Heroku probably reports something else. Shards are a property of a particular index inside ES, not of the ES cluster itself, so check how many shards your index has. But even with one shard your query should not be this slow; the documents were probably indexed the wrong way. You have provided too little information about your index, your queries, and your load.
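As a sketch, the shard count is a per-index setting fixed at index creation time; the numbers below are illustrative (one primary per data node for the 3-node cluster shown above), not a recommendation:

```ruby
# Illustrative per-index settings: 3 primaries (one per data node) with
# 1 replica each. Shard count cannot be changed on an existing index.
index_settings = {
  settings: {
    index: { number_of_shards: 3, number_of_replicas: 1 }
  }
}

# Inspect what the index actually has (live cluster required):
#   Article.__elasticsearch__.client.indices.get_settings(index: 'articles')
#
# Apply by recreating the index (this destroys the existing documents):
#   Article.__elasticsearch__.client.indices.create(index: 'articles', body: index_settings)
#   Article.import
```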

Caching may help, just as with any storage system; the pros and cons are the same as always.

xeye