9

I'm using Hibernate to retrieve the number of rows for a specific query. Let's say I have a table called 'Person' with various columns. One of those columns is 'name'.

If I wanted to get the number of people with the name of 'Andrew', which of these ways would be most efficient? Assuming there is a performance difference between some/all of them. Is there a better way to do this using Hibernate/SQL?

(1) Select all columns

Query query = session.createQuery("from Person where name= :name");
query.setParameter("name", name);
List result = query.list();
int count = result.size();

(2) Select just the name column

Query query = session.createQuery("select name from Person where name= :name");
query.setParameter("name", name);
List result = query.list();
int count = result.size();

(3) Using Count in the query

Query query = session.createQuery("select count(*) from Person where name= :name");
query.setParameter("name", name);
long count = (Long) query.uniqueResult();

(4) Using Count with the name column in the query

Query query = session.createQuery("select count(name) from Person where name= :name");
query.setParameter("name", name);
long count = (Long) query.uniqueResult();

Edit: Sorry, I had two number 3's in my list

digiarnie
  • 22,305
  • 31
  • 78
  • 126
  • What database and version are you using? – Mark Byers Jul 29 '10 at 01:38
  • Well right now I'm just writing some experimental code against HSQLDB. I guess it was more of a general question across all DBs. Though judging by your question, the answer may differ across DBs? – digiarnie Jul 29 '10 at 01:42

3 Answers3

14

Don't retrieve a result set if you just want to count the number of rows, this just means useless overhead:

  • you'll get more stuff than actually wanted (whether you're selecting all columns or just one)
  • you'll need to send them over the wire
  • you'll need to create instances (whether it's a full Person entity or just a String) for nothing.

In other words, if you only want to count, don't do it on the Java side, DBMS are optimized for this task and will do a much better job.

This excludes (1) and (2).

Regarding (3) and (4), note that there is a difference between count(*) and count(col) in general:

  • count(*) counts ALL rows
  • count(col) counts rows with non-null values of col

So they will give different results in performance and query result if col can be NULL (the count(*) being faster), otherwise identical performance.

I'd use (3).

Similar questions

Community
  • 1
  • 1
Pascal Thivent
  • 562,542
  • 136
  • 1,062
  • 1,124
0

The count(*) method has profiled to be significantly faster than the size() method for my company. It is certainly more memory efficient since you aren't pulling across column data that you won't use. I don't know if count(name) makes a difference.

Alain O'Dea
  • 21,033
  • 1
  • 58
  • 84
  • In case anyone is wondering, count(*) vs count(name) does make **A LOT** of difference: https://stackoverflow.com/questions/33480424/eclipselink-jpa-generates-count-queries-using-countid-instead-count – Hendy Irawan Dec 30 '17 at 12:03
0

The less you put inside the COUNT() function the better. If you don't need any of the information from the table, I would say use COUNT(1). You can use COUNT(name) or COUNT(*) as long as your tables are properly indexed.

Cᴏʀʏ
  • 105,112
  • 20
  • 162
  • 194
  • 2
    The `*` in `COUNT(*)` will not expand into all columns and `count(1)` is equivalent to `count(*)`. – Pascal Thivent Jul 29 '10 at 03:15
  • I would agree, except for in PostgreSQL PL/SQL. At least the last version of PostgreSQL I used there was a difference. In Oracle, MS SQL Server, and MySQL, the two are definitely equivalent. I think it's just a bad habit for me. If you're using COUNT() to see if any records exist, for MS SQL the EXISTS() function is more efficient than COUNT(). – Cᴏʀʏ Jul 29 '10 at 03:28