2

I have a question related to ORDER BY or GROUP BY clause.

For example I have below queries

SELECT country_name,COUNT(*) FROM user_location
WHERE country_name IS NOT NULL
GROUP BY country_name 
ORDER BY COUNT(*) DESC

And

SELECT country_name,COUNT(*) As Total FROM user_location
WHERE country_name IS NOT NULL
GROUP BY country_name 
ORDER BY Total DESC

In 2nd query I am using alias Total for COUNT(*) in ORDER BY clause.

Is there any performance related differences in two queries ?

prograshid
  • 876
  • 3
  • 12
  • 32

2 Answers2

1

I've run the following test on a table with 1M products randomly related to 10K categories (MariaDB 10.0.19):

select p.categoryId, count(*) as total
from products p
group by p.categoryId
having count(*) = 100

Execution time: 156 msec

select p.categoryId, count(*) as total
from products p
group by p.categoryId
having total = 100

Execution time: 156 msec

So there doesn't seem to be any difference in performance.

Note that with ORDER BY the engine will copy the result into a temporary table (See EXPLAIN: Using temporary; Using filesort). So the value can't be recalculated, even when you use ORDER BY COUNT(*).

However - There is a difference (which I can not explain) when I use ORDER BY COUNT(DISTINGT ...):

select p.categoryId, count(distinct p.productData) as total
from products p
group by p.categoryId
order by total

Profile: 863 msec for Copying to tmp table

select p.categoryId, count(distinct p.productData) as total
from products p
group by p.categoryId
order by count(distinct p.productData)

Profile: 963 msec for Copying to tmp table

Paul Spiegel
  • 30,925
  • 5
  • 44
  • 53
  • https://dev.mysql.com/doc/refman/5.7/en/show-profile.html "profiling is per process and not per thread. This means that activity on threads within the server other than your own may affect the timing information that you see." – philipxy Aug 11 '17 at 15:20
  • @philipxy It's my dev-machine which is doing "nothing" in the background. – Paul Spiegel Aug 11 '17 at 15:25
  • Well explained @PaulSpiegel. So what is the recommended way to use them ? – prograshid Aug 11 '17 at 16:46
  • 1
    @prograshid I wouldn't care about the performance unless I can measure a significant difference. However - Since using an alias is not slower, I'd usualy us it. This also prevents code duplication. Consider if you have to change `count(*)` to `count(distinct id)` - Using alias you would need to change the code only in the SELECT clause. – Paul Spiegel Aug 11 '17 at 16:56
0

An Alias is just a synonym, so any timing differences are probably due to the phase of the moon.

Rick James
  • 135,179
  • 13
  • 127
  • 222