
I am building a Rails backend to an iPhone app.

After profiling my application, I have found the following call to be especially expensive in terms of performance:

@messages.as_json

This call returns about 30 message objects, each including many child records. As you can see, composing a single message's JSON response can require many DB calls:

def as_json(options={})
  super(:only => [...],
    :include => {
      :user => {...},
      :checkin => {...},
      :likes => {:only => [...],
                 :include => { :user => {...} }},
      :comments => {:only => [...],
                    :include => { :user => {:only => [...]} }}
    },
    :methods => :top_highlight)
end

On average, the @messages.as_json call (all 30 objects) takes almost 1100ms.

Wanting to optimize, I've employed memcached. With the solution below, when all my message objects are in cache, the average response is now 200-300ms. I'm happy with this, but the issue is that it has made cache-miss scenarios even slower. In cases where nothing is in cache, it now takes over 2000ms to compute.

# Note: @messages has the 30 message objects in it, but none of the child records have been grabbed

@messages.each_with_index do |m, i|
  @messages[i] = Rails.cache.fetch("message/#{m.id}/#{m.updated_at.to_i}") do
    m.as_json
  end
end

I understand that there will have to be some overhead to check the cache for each object. But I'm guessing there is a more efficient way than what I'm doing now, which is basically serial, one object at a time. Any pointers on making this more efficient?

pejmanjohn

2 Answers


I believe Rails.cache uses the ActiveSupport::Cache::Store interface, which has a read_multi method for this exact purpose. [1]

I think swapping out fetch for read_multi will improve your performance because ActiveSupport::Cache::MemCacheStore has an optimized implementation of read_multi. [2]

Code

Here's the updated implementation:

keys = @messages.collect { |m| "message/#{m.id}/#{m.updated_at.to_i}" }
hits = Rails.cache.read_multi(*keys)
keys.each_with_index do |key, i|
  if hits.include?(key)
    @messages[i] = hits[key]
  else
    Rails.cache.write(key, @messages[i] = @messages[i].as_json)
  end
end

The cache writes are still performed synchronously with one round trip to the cache for each miss. If you want to cut down on that overhead, look into running background code asynchronously with something like workling.
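As a rough illustration of that idea, here is a minimal plain-Ruby sketch (the AsyncCacheWriter class and its drain method are hypothetical helpers, not a Rails or workling API) that funnels cache writes through a background thread, so the request thread only pays the cost of a queue push:

```ruby
require "thread"

# Hypothetical sketch: queue up cache writes and perform them on a
# background thread, off the request path.
class AsyncCacheWriter
  def initialize(cache)
    @cache = cache
    @queue = Queue.new
    @worker = Thread.new do
      # Pop jobs until the nil sentinel arrives, writing each to the cache.
      while (job = @queue.pop)
        key, value = job
        @cache.write(key, value)
      end
    end
  end

  # Called from the request thread: cheap, just an in-memory push.
  def write(key, value)
    @queue << [key, value]
  end

  # Flush pending writes and stop the worker (e.g. at shutdown).
  def drain
    @queue << nil
    @worker.join
  end
end
```

In a real app you would more likely reach for a proper job queue than a bare thread, but this shows where the write cost moves to.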

Be careful that the overhead of starting the asynchronous job is actually less than the overhead of Rails.cache.write before you start expanding your architecture.

Memcached Multi-Set

It looks like the Memcached team has at least considered providing Multi-Set (batch write) commands, but there aren't any ActiveSupport interfaces for it yet and it's unclear what level of support is provided by implementations. [3]

accounted4
  • This looks great and will reduce round-trips to the cache. Just to educate me further why did you pass keys as "*keys" into read_multi. I've never used this convention in ruby/rails? Looks like a pointer to me. – pejmanjohn Jan 18 '13 at 00:32
  • Just realized that this is missing one nice thing about fetch, which is that it sets the missed reads with the m.as_json value. Any thoughts on fixing that? One approach that comes to mind is passing @messages and hits off to a background job and writing the missing objects asynchronously. – pejmanjohn Jan 18 '13 at 00:38
  • Info on the '*keys' syntax [here](http://stackoverflow.com/questions/4643277/given-an-array-of-arguments-how-do-i-send-those-arguments-to-a-particular-funct) – accounted4 Jan 18 '13 at 00:42
  • Yes, what I mean is that with 'fetch' the missing objects are written to cache so that on the next read it will be a hit. The way I understand the code above is that it will indeed get the full set of messages but missed objects will still be misses next time around. – pejmanjohn Jan 18 '13 at 00:45
  • Oh I see what you are saying, you are right. I'm not sure how to go about doing that asynchronously in Ruby. I'll modify my solution to synchronously populate the cache and then look into asynchronous. – accounted4 Jan 18 '13 at 00:48
  • @SebastianGoodman I've run across the async use case as well today, and have started a new question, specifically with that in mind: http://stackoverflow.com/questions/20724164/improving-rails-cache-write-by-setting-key-value-pairs-asynchronously – zealoushacker Dec 22 '13 at 00:49

As of Rails 4.1, you can now do fetch_multi and pass in a block.

http://api.rubyonrails.org/classes/ActiveSupport/Cache/Store.html#method-i-fetch_multi

messages_by_key = @messages.index_by { |m| "message/#{m.id}/#{m.updated_at.to_i}" }
hits = Rails.cache.fetch_multi(*messages_by_key.keys) do |key|
  # The block runs only for cache misses; its result is written back.
  messages_by_key[key].as_json
end
@messages = hits.values

Note: if you're setting many items, you may want to consider writing to the cache in some sort of background worker.
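To make the miss-handling explicit, here is a plain-Ruby stand-in (the fetch_multi_demo helper is hypothetical, not Rails' actual implementation) mimicking fetch_multi's contract: cached keys are returned as-is, and the block runs only for misses, whose results are written back to the store:

```ruby
# Hypothetical stand-in for fetch_multi's contract, backed by a plain Hash:
# hits come straight from the store; the block runs only for missing keys,
# and its result is stored before being returned.
def fetch_multi_demo(store, *keys)
  keys.each_with_object({}) do |key, results|
    results[key] = store.fetch(key) { store[key] = yield(key) }
  end
end
```

Running it against a store that already holds "a" shows the block firing only for the missing "b", which then becomes a hit on subsequent calls.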

skalb