0

In SQL Server, I'm running a query on users age groups on data where, for some years, there are zero users per age group. For example there were users in 2013 in the "18-21" age group, so the query returns the next age group, "22-25", as the first row because there were no entries containing "18-21." Instead, I would like to return a row that contains 18-21, but has 0 as the value for number of users.

Currently, I have:

SELECT YEAR, AGE_GROUP, SUM(USERS) as usercount,
FROM USERS
WHERE YEAR = '2013'
AND PRIMARY_GROUP = 'NT'
GROUP BY YEAR, AGE_GROUP

This returns:

YEAR  AGE_GROUP usercount
2014    22-25      200
2014    25-28       10

I want it to return:

YEAR  AGE_GROUP usercount
2014    18-21       0
2014    22-25      200
2014    25-28       10

How can I create a row for specific values that don't exist and fill the count with 0 values?

For the record, I DO in fact have a column called 'users' in the users table. Confusing, I know, but it's a stupidly named schema that I took over. The Users table contains data ABOUT my users for reporting. It should probably have been named something like Users_Reporting.

KinsDotNet
  • 1,500
  • 5
  • 25
  • 53
  • How do you know what all possible age groups are if they're not present in your USERS table? – Kyle Hale Nov 27 '14 at 13:40
  • @KyleHale: He's familiar with normal human lifespans, probably. – siride Nov 27 '14 at 15:31
  • @siride What kind of stupid response is that? He has 3 specific group ranges listed, you have no idea if the others are evenly distributed, where they end (maybe he's only tracking people 18-50), or any other info. RDMBSes don't work on magic, you have to have a defined data set somewhere to be able to reproduce it in a query. – Kyle Hale Nov 28 '14 at 16:37
  • 1
    @KyleHale: it doesn't matter because all of the answers require the OP to pick the ranges he wants anyway. The answer to your question doesn't really change the result any. – siride Nov 28 '14 at 17:17
  • @siride It absolutely matters if we want to provide an actual solution and not just some theoretical solution like Conan's. Also, maybe it will make the OP actually think about what they're asking for. – Kyle Hale Nov 28 '14 at 17:51

4 Answers4

3

I assume you have another table that contain all rows Age Group.

TABLE NAME: AGEGROUPS

AGE_GROUP
18-21
22-25
25-28

Try this:

SELECT '2014' AS YEAR, AG.AGE_GROUP, COALESCE(TB.usercount, 0) AS usercount
FROM (
    SELECT YEAR, AGE_GROUP, SUM(USERS) as usercount,
    FROM USERS
    WHERE YEAR = '2014'
    AND PRIMARY_GROUP = 'NT'
    GROUP BY YEAR, AGE_GROUP
) AS TB
RIGHT JOIN AGEGROUPS AG ON TB.AGE_GROUP=AG.AGE_GROUP
Conan
  • 3,074
  • 1
  • 22
  • 24
  • 1
    I'd use a `LEFT JOIN` with the opposite table ordering, but the result is the same. This is probably the best way to do it. – siride Nov 27 '14 at 15:21
1

You do this by joining with a "fake" list of age group+years:

SELECT AGE_GROUPS.YEAR, AGE_GROUPS.AGE_GROUP, COALESCE(SUM(USERS), 0) as usercount
FROM (
    SELECT YEAR, AGE_GROUP
    FROM (
        SELECT '18-21' AS AGE_GROUP
        UNION SELECT '22-25'
        UNION SELECT '25-28'
    ) AGE_GROUPS, (SELECT DISTINCT YEAR FROM USERS) YEARS
) AGE_GROUPS
LEFT JOIN USERS ON (USERS.AGE_GROUP = AGE_GROUPS.AGE_GROUP AND USERS.YEAR = AGE_GROUPS.YEAR)
WHERE AGE_GROUPS.YEAR = '2014'
GROUP BY AGE_GROUPS.YEAR, AGE_GROUPS.AGE_GROUP

You can also simplify this, assuming that your USERS table has all possible age groups ignoring a specific year:

SELECT AGE_GROUPS.YEAR, AGE_GROUPS.AGE_GROUP, COALESCE(SUM(USERS), 0) as usercount
FROM (
    SELECT YEAR, AGE_GROUP
    FROM (SELECT DISTINCT AGE_GROUP FROM USERS) AGE_GROUPS, (SELECT DISTINCT YEAR FROM USERS) YEARS
) AGE_GROUPS
LEFT JOIN USERS ON (USERS.AGE_GROUP = AGE_GROUPS.AGE_GROUP AND USERS.YEAR = AGE_GROUPS.YEAR)
WHERE AGE_GROUPS.YEAR = '2014'
GROUP BY AGE_GROUPS.YEAR, AGE_GROUPS.AGE_GROUP
Jonathan Amend
  • 12,715
  • 3
  • 22
  • 29
0

Assuming that all the years and age groups are in the table (some row for the table), then you can do this just with this table. The idea is to generate all the rows using a cross join and then use a left join to bring in the values you want:

select y.year, ag.age_group, count(u.year) as usercount
from (select 2013 as year) y cross join
     (select distinct age_group from users) ag left join
     users u
     on u.year = y.year and u.age_group = ag.age_group and
        u.primary_group = 'NT'
group by y.year, ag.age_group;

I don't know what sum(users) is supposed to be. If you do indeed have a column users.users, then use it with sum(). It looks like you really want count().

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I'm sorry, I made an adjustment to the query. term_reporting was supposed to be year. I'm still not sure if this method would achieve my end goal. My current query returns 2014, 22-25, 200 as year, age_group and usercount respectively, I would like it to return 2014, 18-21, 0 as year, age_group and usercount, and THEN the next row should be 2014,22-25, 200. I edited my question to make my end result more clear. – KinsDotNet Nov 27 '14 at 03:24
-1

Try This...

SELECT YEAR, AGE_GROUP, isnull(Sum(USERS),0) as usercount,
FROM USERS
 WHERE YEAR = '2013'
AND PRIMARY_GROUP = 'NT'
GROUP BY YEAR, AGE_GROUP
Dhaval
  • 2,341
  • 1
  • 13
  • 16
  • This won't work. I'm not trying to make null values into zeroes, I'm trying to add a whole other row when a value doesn't exist. – KinsDotNet Nov 27 '14 at 03:38
  • @Dhaval: I think it's pretty obvious what's going on here, so Fiddle won't really help. – siride Nov 27 '14 at 15:21