Most efficient way to count rows of a query

Question

I'm using Hibernate to retrieve the number of rows for a specific query. Let's say I have a table called 'Person' with various columns. One of those columns is 'name'.

If I wanted to get the number of people with the name of 'Andrew', which of these ways would be most efficient? Assuming there is a performance difference between some/all of them. Is there a better way to do this using Hibernate/SQL?

(1) Select all columns

Query query = session.createQuery("from Person where name= :name");
query.setParameter("name", name);
List result = query.list();
int count = result.size();

(2) Select just the name column

Query query = session.createQuery("select name from Person where name= :name");
query.setParameter("name", name);
List result = query.list();
int count = result.size();

(3) Using Count in the query

Query query = session.createQuery("select count(*) from Person where name= :name");
query.setParameter("name", name);
long count = (Long) query.uniqueResult();

(4) Using Count with the name column in the query

Query query = session.createQuery("select count(name) from Person where name= :name");
query.setParameter("name", name);
long count = (Long) query.uniqueResult();

Edit: Sorry, I had two number 3's in my list

Well right now I'm just writing some experimental code against HSQLDB. I guess it was more of a general question across all DBs. Though judging by your question, the answer may differ across DBs? — digiarnie, Jul 29 '10 at 01:42

score 14 · Accepted Answer · edited May 23 '17 at 12:22

Don't retrieve a result set if you just want to count the number of rows, this just means useless overhead:

you'll get more stuff than actually wanted (whether you're selecting all columns or just one)
you'll need to send them over the wire
you'll need to create instances (whether it's a full Person entity or just a String) for nothing.

In other words, if you only want to count, don't do it on the Java side, DBMS are optimized for this task and will do a much better job.

This excludes (1) and (2).

Regarding (3) and (4), note that there is a difference between count(*) and count(col) in general:

count(*) counts ALL rows
count(col) counts rows with non-null values of col

So they will give different results in performance and query result if col can be NULL (the count(*) being faster), otherwise identical performance.

I'd use (3).

Similar questions

score 0 · Answer 2 · answered Jul 29 '10 at 01:35

0

The count(*) method has profiled to be significantly faster than the size() method for my company. It is certainly more memory efficient since you aren't pulling across column data that you won't use. I don't know if count(name) makes a difference.

answered Jul 29 '10 at 01:35

Alain O'Dea

21,033
1
58
84

In case anyone is wondering, count(*) vs count(name) does make **A LOT** of difference: https://stackoverflow.com/questions/33480424/eclipselink-jpa-generates-count-queries-using-countid-instead-count – Hendy Irawan Dec 30 '17 at 12:03

score 0 · Answer 3 · answered Jul 29 '10 at 01:38

0

The less you put inside the COUNT() function the better. If you don't need any of the information from the table, I would say use COUNT(1). You can use COUNT(name) or COUNT(*) as long as your tables are properly indexed.

answered Jul 29 '10 at 01:38

Cᴏʀʏ

105,112
20
162
194

2

The `*` in `COUNT(*)` will not expand into all columns and `count(1)` is equivalent to `count(*)`. – Pascal Thivent Jul 29 '10 at 03:15
I would agree, except for in PostgreSQL PL/SQL. At least the last version of PostgreSQL I used there was a difference. In Oracle, MS SQL Server, and MySQL, the two are definitely equivalent. I think it's just a bad habit for me. If you're using COUNT() to see if any records exist, for MS SQL the EXISTS() function is more efficient than COUNT(). – Cᴏʀʏ Jul 29 '10 at 03:28

Most efficient way to count rows of a query

3 Answers3

Similar questions

Linked