After studying DHH's and other blog articles about key-based cache expiration and Russian Doll Caching, I am still unsure how to handle one relation type. To be specific, a has_many relationship.

I will share the results of my research on a sample app. It involves a bit of storytelling, so hang on. Let's say we have the following ActiveRecord models. All we care about is a proper change of the model's cache_key, right?

class Article < ActiveRecord::Base
  attr_accessible :author_id, :body, :title
  has_many :comments
  belongs_to :author
end

class Comment < ActiveRecord::Base
  attr_accessible :article_id, :author_id, :body
  belongs_to :author
  belongs_to :article, touch: true
end

class Author < ActiveRecord::Base
  attr_accessible :name
  has_many :articles
  has_many :comments
end

We already have one article with one comment, each written by a different author. The goal is to have the article's cache_key change in the following cases:

  1. Article's body or title changes
  2. Its comment's body changes
  3. Article's author's name changes
  4. Article's comment's author's name changes

So by default, we are good for cases 1 and 2.

1.9.3-p194 :034 > article.cache_key
 => "articles/1-20130412185804"
1.9.3-p194 :035 > article.comments.first.update_attribute('body', 'First Post!')
1.9.3-p194 :038 > article.cache_key
 => "articles/1-20130412185913"

But not for case 3.

1.9.3-p194 :040 > article.author.update_attribute('name', 'Adam A.')
1.9.3-p194 :041 > article.cache_key
 => "articles/1-20130412185913"
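To make the mechanics explicit, here is a minimal plain-Ruby sketch of how touch: true propagates from a comment to its article, and why nothing propagates from an author. It does not use Rails: the Record base class and the parents_to_touch and article_keys_through_changes methods are hypothetical stand-ins, and the timestamps are passed in by hand.

```ruby
# Plain-Ruby stand-ins for the models above; this is not ActiveRecord.
class Record
  attr_reader :id, :updated_at

  def initialize(id, updated_at)
    @id = id
    @updated_at = updated_at
  end

  # Same shape as the keys in the console output, e.g. "articles/1-20130412185804".
  def cache_key
    "#{self.class.name.downcase}s/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S')}"
  end

  # Bump the timestamp, then propagate to whatever this record is set to touch.
  def touch(now)
    @updated_at = now
    parents_to_touch.each { |parent| parent.touch(now) }
  end

  def parents_to_touch
    []
  end
end

class Author < Record; end

class Article < Record
  attr_accessor :author
end

class Comment < Record
  attr_accessor :article, :author

  # Models `belongs_to :article, touch: true`; the comment's author is not touched.
  def parents_to_touch
    [article]
  end
end

# Returns the article's key before any change, after a comment change (case 2),
# and after an author change (case 3).
def article_keys_through_changes
  t0 = Time.utc(2013, 4, 12, 18, 58, 4)
  article = Article.new(1, t0)
  article.author = Author.new(1, t0)
  comment = Comment.new(1, t0)
  comment.article = article
  comment.author = Author.new(2, t0)

  keys = [article.cache_key]
  comment.touch(t0 + 69)
  keys << article.cache_key
  article.author.touch(t0 + 120)
  keys << article.cache_key
end

puts article_keys_through_changes
```

The second and third keys come out equal, which is case 3: nothing connects the author's timestamp to the article's.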

Let's define a composite cache_key method for Article.

class Article < ActiveRecord::Base
  attr_accessible :author_id, :body, :title
  has_many :comments
  belongs_to :author

  def cache_key
    [super, author.cache_key].join('/')
  end
end

1.9.3-p194 :007 > article.cache_key
 => "articles/1-20130412185913/authors/1-20130412190438"
1.9.3-p194 :008 > article.author.update_attribute('name', 'Adam B.')
1.9.3-p194 :009 > article.cache_key
 => "articles/1-20130412185913/authors/1-20130412190849"

Win! But of course this does not work for case 4.

1.9.3-p194 :012 > article.comments.first.author.update_attribute('name', 'Bernard A.')
1.9.3-p194 :013 > article.cache_key
 => "articles/1-20130412185913/authors/1-20130412190849"

So what options are left? We could do something with the has_many association on Author, but has_many does not take the {touch: true} option, and probably for a reason. I guess it could be implemented somewhat along the following lines.

class Author < ActiveRecord::Base
  attr_accessible :name
  has_many :articles
  has_many :comments

  before_save do
    articles.each { |record| record.touch }
    comments.each { |record| record.touch }
  end
end

article.comments.first.author.update_attribute('name', 'Bernard B.')
article.cache_key
  => "articles/1-20130412192036"

While this does work, it has a huge performance impact: it loads, instantiates, and updates every article and comment by that author, one by one. I don't believe it is a proper solution, but what is?

Sure the 37signals use case / example might be different: project -> todolist -> todo. But I imagine a single todo item also belonging to a user.

How would one solve this caching problem?

Graham Conzett
mlangenberg
  • This doesn't address the caching issue specifically, but implementing touch doesn't have to be so intensive. You could simply do something like `articles.update_all(updated_at: Time.now)`, which would result in one op for articles (and one for comments). – numbers1311407 Jun 05 '13 at 05:31
  • @numbers1311407 The problem with that method is `update_all` only executes SQL, no callbacks are performed so subsequent `touch`es will not happen, and `cache_key`s on in memory objects are not regenerated. – Graham Conzett Jun 05 '13 at 06:38
  • Sure, but is automatic updating of in-memory objects really your concern? This is an issue in the rails console, but typically not in the real world where the author is changing his name in one action, then he and others are rendering comment views in another. If you do find yourself in a situation where you need to refresh the cache key on an in-memory object, just `reload` it. – numbers1311407 Jun 05 '13 at 16:06
  • @numbers1311407 The bigger issue of the two is chaining the `touch` calls. If a user updates his name and we call `update_all` on his comments, the `belongs_to :article, touch: true` on the comment will not fire and the fragment cache for the article will not be expired. At least that's what I've seen, please correct me if I'm wrong. You could always expire all articles that a user has comments on by hand, but that would get hard to maintain as the tree gets larger. Unfortunately I don't see an alternative currently. – Graham Conzett Jun 05 '13 at 19:05
  • No, you're right, `update_all` would not run the callbacks. I suppose you could add a `has_many :commented_articles, through: :comments` type association to author, and touch those in the callback as well. But I see your point, russian doll does become messy very quickly when you're rendering content outside of a purely hierarchical structure (like the author's username). It is strange that you never see a mention of the solutions in russian doll caching writeups, like the linked 37Signals. – numbers1311407 Jun 05 '13 at 20:24
  • I'm voting to close this because it is too localized. You might want to check the [cache_digest](https://github.com/rails/cache_digests) gem, which implements russian doll caching, to see how to do it. - just discovered that bounty hunter questions can't be closed. Will mark to close once it is finished. – fotanus Jun 06 '13 at 14:25
  • This is not too localized. Russian doll caching is the canonical (recommended by DHH and the edge railsguide) caching strategy for Rails. This question points out a gotcha with the strategy and asks for an appropriate solution. `cache_digest` is about managing template dependencies, tied to the template files themselves, which frees you from explicit versioning. I haven't used it, but I don't believe it has anything to do with this problem (non-hierarchical model data dependencies), and in fact, would suffer from it just the same. – numbers1311407 Jun 06 '13 at 19:33
  • @fotanus This has nothing to do with cache digests. The problem cache_digests attempts to solve is one where you change your templates and need to expire the cache without the data changing. This question is about expiring fragment caches in has_many relationships, I have updated the question and tags for clarity. – Graham Conzett Jun 07 '13 at 04:08
  • Not sure if this is a good solution, but have you read [this article](http://mark.stratmann.me/content_items/rails-caching-strategy-using-key-based-approach) about how this guy solves it? Even in [this DHH blog post](http://37signals.com/svn/posts/3112-how-basecamp-next-got-to-be-so-damn-fast-without-using-much-client-side-ui) he has user names listed in that project cache (first screenshot). How do they handle busting the cache when one of those names changes? I'd love to get a better answer myself. It seems like this is an important question. – Slickrick12 Jun 12 '13 at 21:46

2 Answers

One method I stumbled on is to handle this via the cache keys. Add a has_many :through relationship for commenters to the article:

class Article < ActiveRecord::Base
  attr_accessible :author_id, :body, :title
  has_many :comments
  has_many :commenters, through: :comments, source: :author
  belongs_to :author
end

Then in article/show we would construct the cache key like this:

<% cache [@article, @article.commenters, @article.author] do %>
  <h2><%= @article.title %></h2>
  <p>Posted By: <%= @article.author.name %></p>
  <p><%= @article.body %></p>
  <ul><%= render @article.comments %></ul>
<% end %>

The trick is that the cache key generated from the commenters association changes whenever a commenter's record is updated, and also when the set of commenters changes as comments are added or deleted. While this does require extra SQL queries to generate the cache key, it plays nicely with Rails' low-level caching, and adding something like the identity_cache gem can easily help with that.
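As a rough illustration of why the array key changes, here is a plain-Ruby sketch that joins the cache_key of every element, which is roughly how Rails expands an array passed to cache. It is a simplified stand-in and not the Rails implementation: the Person and Post structs and the expand_key and fragment_key helpers are hypothetical names used only for this example.

```ruby
# Hypothetical stand-ins for Author and Article; plain Ruby, no Rails.
Person = Struct.new(:id, :updated_at) do
  def cache_key
    "authors/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S')}"
  end
end

Post = Struct.new(:id, :updated_at, :author, :commenters) do
  def cache_key
    "articles/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S')}"
  end
end

# Flatten nested arrays and join each element's own cache_key with "/".
def expand_key(parts)
  parts.flatten.map(&:cache_key).join('/')
end

# Same shape as `cache [@article, @article.commenters, @article.author]`.
def fragment_key(article)
  expand_key([article, article.commenters, article.author])
end

t0 = Time.utc(2013, 4, 12, 19, 0, 0)
adam    = Person.new(1, t0)
bernard = Person.new(2, t0)
post    = Post.new(1, t0, adam, [bernard])

puts fragment_key(post)
bernard.updated_at = t0 + 60 # case 4: a commenter changes their name
puts fragment_key(post)
```

The article's own timestamp never moves here; the key changes only because the commenter's segment of it does.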

I would like to see if other people have cleaner solutions to this though.

Graham Conzett
  • This technique works well for the show action of a single record, but it's not viable for the top level (outer) russian doll cache for an index action of several records -- where the cache of each record has the issue described here – Chris Beck May 19 '14 at 17:51
  • You could use a custom cache key helper in that case, detailed here in the Rails guides: http://edgeguides.rubyonrails.org/caching_with_rails.html#highlighter_32543 – Graham Conzett May 19 '14 at 18:54
  • Roger that -- like the original post does with `def cache_key [super, author.cache_key].join('/') end` – Chris Beck May 20 '14 at 20:22

As advised in https://rails.lighthouseapp.com/projects/8994/tickets/4392-add-touch-option-to-has_many-associations, I created an after_save callback to update the timestamps of the related objects.

  after_save :touch_line_items_and_tactics

  # Touch each related record so that its cache_key changes.
  def touch_line_items_and_tactics
    line_item_advertisements.each(&:touch)
  end

As an aside, we built our Rails app on a legacy database that uses last_modified_time as the column name, with the meaning "when the user last modified it". Because of the differing semantics, we could not use the :touch option out of the box. I had to monkeypatch the cache_key and touch methods like this https://gist.github.com/tispratik/9276110 so as to store the updated timestamp in memcached instead of the database's updated_at column.

Also note that I could not use the default cache_timestamp_format from Rails, as it provided timestamps only up to seconds. I needed a more granular timestamp, so I chose :nsec (nanoseconds).

Timestamp with cache_timestamp_format: 20140227181414
Timestamp with nsec: 20140227181414671756000
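A small plain-Ruby sketch of why the granularity matters: two writes within the same second produce the same seconds-resolution timestamp, so a key built from it would not change. The format strings are the ones I believe correspond to the two timestamps above; the constant names and the stamp helper are made up for this example.

```ruby
# Format strings assumed to match the two timestamps shown above.
SECONDS_FORMAT = '%Y%m%d%H%M%S'
NSEC_FORMAT    = '%Y%m%d%H%M%S%9N'

# Format a time in UTC without modifying the receiver.
def stamp(time, format)
  time.getutc.strftime(format)
end

first  = Time.utc(2014, 2, 27, 18, 14, 14, 671_756) # 7th argument is microseconds
second = first + Rational(1, 5)                      # another write 200 ms later

puts stamp(first, SECONDS_FORMAT) == stamp(second, SECONDS_FORMAT)
puts stamp(first, NSEC_FORMAT) == stamp(second, NSEC_FORMAT)
```

With seconds resolution the two stamps collide, so a fragment cached after the first write would still be served after the second.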

Pratik Khadloya