0

I'm building a website using php/mysql where there will be Posts and Comments.

Posts need to show number of comments they have. I have count_comments column in Posts table and update it every time comment is created or deleted.

Someone recently advised me that denormalazing this way is a bad idea and I should be using caching instead.

hakre
  • 193,403
  • 52
  • 435
  • 836
DavidW
  • 5,069
  • 14
  • 46
  • 68
  • Is there any table relation between post and comment table? – sathishkumar Jan 24 '12 at 18:10
  • 1
    Different people worry about different things. If you're new to programming, you should be focusing on learning and making things work, not on the tiny little tidbits. When I started building, I used MySQL for everything and slowly moved more and more into caching. – animuson Jan 24 '12 at 18:11
  • Not a good idea for critical data as it can get out of sync unless you are very very careful. In your case, it will save some query time and if there 80 comments but it says 81 - not the end of the world. Your trade-off. Every so often you could run a process that corrects any miscounts. – Mouse Food Jan 24 '12 at 18:13

6 Answers6

2

My take is: You are doing the right thing. Here is why:

See the field count_comments as not being part of your data model - this is easily provable, you can delete all contents of this field and it is trivial to recreate it.

Instead see it as a cache, the storage of which is just co-located with the post - perfectly smart, as you get it for free whenever you have to query for the post(s)

hakre
  • 193,403
  • 52
  • 435
  • 836
Eugen Rieck
  • 64,175
  • 10
  • 70
  • 92
1

I do not think this is a bad approach.

One thing i do recognize is that its very easy to introduce side effects as code base is expanded by having a more rigid approach. The nice part is at some point the amount of rows in the database will have to be calculated or kept track of, there is not really a way of getting out of this.

I would not advise against this. There are other solutions to getting comment counts. Check out Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)

The solution is slower upon selects, but requires less code to keep track of comment count.


I will say that your approach avoids LIMIT DE-optimization, which is a plus.

Community
  • 1
  • 1
ThePrimeagen
  • 4,462
  • 4
  • 31
  • 44
1

This is an optimization that is almost never needed for two reasons:

1) Proper indexing will make simple counts extremely fast. Ensure that your comments.post_id column has an index.

2) By the time you need to cache this value, you will need to cache much more. If your site has so many posts, comments, users and traffic that you need to cache the comments total, then you will almost definitely need to be employing caching strategies for much of your data/output (saving built pages to static, memcache, etc.). Those strategies will, no doubt, encompass your comments total, making the table field approach moot.

webbiedave
  • 48,414
  • 8
  • 88
  • 101
  • It would be best for me to measure these speeds, but I do not have the data to measure, so I guess I can find out if some day site grows large. – DavidW Jan 25 '12 at 06:28
0

I have no idea what was meant by "Caching" and I'll be interested in some other answer that the one I have to offer:

Remove redundant information from your database is important and, in a "Believer way" (means that I didn't really test it, its merely speculative), I think that using SUM() function from your database is a better way to go for it. Assuming that all your comments has a post_id, all you need is something like:

SELECT SUM(id) FROM comments WHERE id = {post_id_variation_here}

That way, you reduce 1 constant CRUD happening just to read how much comments there are and increase performance.

Marco Aurélio Deleu
  • 4,279
  • 4
  • 35
  • 63
0

Unless you haven't hundreds or thousands of hits per seconds on your application there's nothing wrong about using a SQL statement like this:

select posts_field1, ..., (select count(*) from comments where comments_parent = posts_id) as commentNumber from posts

you can go with caching the html output of your page anyway. than no database query has to be done at all.

zaphod1984
  • 836
  • 2
  • 7
  • 22
0

Maby you could connect the post and comment tables to each other and count the comments rows in mysql with the mysql function: mysql_num_rows. Like so:

Post table

postid*
postcontent

Comment table

commentid
postid*
comment

And then count the comments in mysql like:

$link = mysql_connect("localhost", "mysql_user", "mysql_password"); 
mysql_select_db("database", $link);

$result = mysql_query("SELECT * FROM commenttable WHERE postid = '1'", $link); 
$num_rows = mysql_num_rows($result);
Joey
  • 645
  • 5
  • 19