Consider a voting system implemented in PostgreSQL, where each user can vote up or down on a "foo". There is a foo
table that stores all the "foo information", and a votes
table that stores the user_id
, foo_id
, and vote
, where vote
is +1 or -1.
To get the vote tally for each foo, the following query would work:
SELECT sum(vote) FROM votes WHERE foo.foo_id = votes.foo_id;
But, the following would work just as well:
(SELECT count(vote) FROM votes
WHERE foo.foo_id = votes.foo_id
AND votes.vote = 1)
- (SELECT count(vote) FROM votes
WHERE foo.foo_id = votes.foo_id
AND votes.vote = (-1))
I currently have an index on votes.foo_id
.
Which is a more efficient approach? (In other words, which would run faster?) I'm interested in both the PostgreSQL-specific answer and the general SQL answer.
EDIT
A lot of answers have been taking into account the case where vote
is null. I forgot to mention that there is a NOT NULL
constraint on the vote column.
Also, many have been pointing out that the first is much easier to read. Yes, it is definitely true, and if a colleague wrote the 2nd one, I would be exploding with rage unless there was a performance necessity. Never the less, the question is still on the performance of the two. (Technically, if the first query was way slower, it wouldn't be such a crime to write the second query.)