0

I need to add a counter cache to a database that has about 4 million rows. The normal way I would do this is in a migration like so:

class AddClicksCounterCacheToPosts < ActiveRecord::Migration
  def change
    add_column :posts, :clicks_count, :integer
    Post.find_each { |post| Post.reset_counters(post.id, :clicks) }
  end
end

However this is much too slow. I came across a way to do this in pure SQL but it looks like it was written for MySQL and I can't seem to get it to work for Postgres. Here's what I am trying:

class AddClicksCounterCacheToPosts < ActiveRecord::Migration
  def change
    add_column :posts, :clicks_count, :integer
    execute <<-eos
      update posts, (select
                      id as post_id, coalesce(count, 0) as count
                    from posts left join
                      (select post_id, count(id) as count from clicks group by post_id) as count_table
                    on
                      posts.id = count_table.post_id) as count_table
      set
        posts.clicks_count = count_table.count
      where
        posts.id = count_table.post_id
    eos
  end
end

And this is the error I get:

ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR:  syntax error at or near ","

I'm pretty sure commas are allowed in postgres but to be honest I don't write too much raw postgres so I'm not sure.

Any idea how to convert this to Postgres?

goddamnyouryan
  • 6,854
  • 15
  • 56
  • 105

2 Answers2

1

Automated translation from MySQL to Postgres (without assessing of the correctness of the original query):

update posts
set posts.clicks_count = count_table.count
from (
    select id as post_id, coalesce(count, 0) as count
    from posts
    left join (
        select post_id, count(id) as count
        from clicks group by post_id) as count_table
    on posts.id = count_table.post_id) as count_table
where
    posts.id = count_table.post_id;
klin
  • 112,967
  • 15
  • 204
  • 232
1

With proper UPDATE syntax the same could look like this in Postgres:

UPDATE posts p
SET    clicks_count = ct.ct
FROM  (
   SELECT po.id, COALESCE(c.ct, 0) AS ct
   FROM   posts po
   LEFT   JOIN  (
      SELECT post_id, count(*) AS ct
      FROM   clicks
      GROUP  BY 1
      ) c ON c.post_id = po.id
   ) ct
WHERE  p.id = ct.id
AND    p.clicks_count <> ct.ct;  -- avoid empty update

I added another condition AND p.clicks_count <> ct.ct to avoid empty updates. Details:

It might be faster to run these two queries instead:

UPDATE posts p
SET    clicks_count = p.ct
FROM  (
   SELECT post_id, count(*) AS ct
   FROM   clicks
   GROUP  BY 1
   ) ct
WHERE  p.id = p.category_id
AND    clicks_count <> p.ct;

UPDATE posts p
SET    clicks_count = 0
WHERE  NOT EXISTS (SELECT 1 FROM clicks WHERE post_id = p.id)
AND    p.clicks_count <> 0;
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228