0

I have a record of students and I need to group them by age. I have this scenario:

  • If Age group has only one member, display his/her name and sex
  • Else
    • Display 'SOME STRING' as name
    • Check sex of members, if All FEMALE, return FEMALE. If all MALE, return MALE, else return MIX

I came up with this query

    SELECT Count(id) as GROUP_SIZE,
    CASE WHEN COUNT(id) = 1 THEN name ELSE 'SOME STRING' END as name,
    CASE WHEN COUNT(id) = 1 THEN sex ELSE (
       CASE WHEN COUNT(CASE WHEN sex = 'MALE' THEN 1 END) = 0 THEN 'FEMALE'
            WHEN COUNT(CASE WHEN sex = 'FEMALE' THEN 1 END) = 0 THEN 'MALE'
            ELSE 'MIX'
       END
    ) END as sex
    FROM students GROUP BY age                

This query output my desired scenario but I have multiple calls to Count(id). Will this impact performance as multiple Count was called? Or if Count(id) was performed once, will it have a constant time on succeeding calls to Count(id)? I tried CASE WHEN GROUP_SIZE but it is not working. Please advise how I can improve this. Thanks.

Frank Smith
  • 1,717
  • 4
  • 18
  • 32

3 Answers3

1

I'm unsure quite how optimising MySQL is when it comes to count(COLUMN), however if you use count(1) instead, you'll get the (at least equal) best performance.

Endareth
  • 492
  • 3
  • 15
  • 2
    reference: http://stackoverflow.com/questions/2876909/count-and-countcolumn-name-whats-the-diff – Barmar Jun 26 '15 at 07:25
1

I would expect it to reuse the value. When you use GROUP BY, it has to collect all the rows into groups. The data structure that holds the groups probably has the size of the group in an element, so it doesn't have to do any actual counting, it's done during the process of making the groups.

Barmar
  • 741,623
  • 53
  • 500
  • 612
1

Might not matter much for performance, but see how this feels:

SELECT  age,   -- (I think you forgot this)
        GROUP_SIZE,
        IF (GROUP_SIZE = 1, a_name, 'SOME STRING) AS name,
        IF (min_sex = max_sex, max_sex, 'MIX') AS sex
    FROM  
      ( SELECT  COUNT(*) AS GROUP_SIZE,
                MAX(name) AS a_name,
                MAX(sex) AS max_sex,
                MIN(sex) AS min_sex,
            FROM  students
            GROUP BY  age 
      ) z
    ORDER BY  age;   -- (you probably wanted this)

COUNT(*) is the normal pattern. COUNT(x) checks x for being NULL and does not count them.

Rick James
  • 135,179
  • 13
  • 127
  • 222