ordering by alpha numeric string in SQL

Question

I'm working with a query that looks like this:

SELECT
  CASE 
    WHEN num BETWEEN 1 AND 5 THEN '1-5'
    WHEN num BETWEEN 6 AND 10 THEN '6-10'
    WHEN num BETWEEN 11 AND 20 THEN '11-20'
    WHEN num > 20 THEN '20+'
END AS bucket,
COUNT(1)
FROM data
GROUP BY 1

...

I'd like to order the results by the different values of bucket so the results look something like this

bucket	count(1)
1-5	10
6-10	11
11-20	17
20+	8

Since bucket is a string, simply sorting by that column doesn't do the trick.

Are there any string operations in SQL that can help me? For reasons I won't get into the CASE statements are the easiest way for me to get results in the format I need, so I'd rather deal with a bunch of string operations to get it in the order I want than change the structure of the query

Welcome to SO. You could go with 01-05, 06-10, 11-20, 20+ as your strings. — topsail, Jun 13 '22 at 23:46
`GROUP BY 1` <-- [You should change this to use the explicit column name](https://stackoverflow.com/questions/7392730/what-does-sql-clause-group-by-1-mean) - also `COUNT(1)` [is not idiomatic](https://stackoverflow.com/questions/1221559/count-vs-count1-sql-server), is there a reason you're not using `COUNT(*)`? — Dai, Jun 13 '22 at 23:47
Is `num` an integer or non-integer value? If it's non-integer, what bucket should `5.5` go into? — Dai, Jun 13 '22 at 23:50
Yeah, If this is going to be a production query or a query you intend to keep using for a while, it is better to make the grouped on column explicit by name - it makes the intention of the query more clear and protects it from unitentional bugs if there are future changes in the database table structure. Indeed - it is a little hard to be sure what this query means with that unnamed field in there just as (1) - is it the buckets, or another field? I am not really sure! — topsail, Jun 13 '22 at 23:53

score 1 · Answer 1 · answered Jun 13 '22 at 23:59

My approach uses integer lower-bounds for the buckets (as integers are nicer to deal with than strings), and only generates the string bucket names at the very end, after grouping.

WITH histogramData AS (
    SELECT
        CASE
            WHEN num <  1 THEN NULL
            WHEN num <  6 THEN 1
            WHEN num < 11 THEN 6
            WHEN num < 21 THEN 11 ELSE 21
        END AS Bucket,
        data.*
    FROM
        data
),
grouped AS (
    SELECT
        h.Bucket,
        COUNT(*) AS [Count]
    FROM
        histogramData AS h
    GROUP BY
        h.Bucket
)
SELECT
    CASE ISNULL( g.Bucket, 0 )
        WHEN  0 THEN NULL,
        WHEN  1 THEN '1-5'
        WHEN  6 THEN '6-10'
        WHEN 11 THEN '11-20' ELSE '21+'
    END AS [Bucket Name],
    g.[Count]
FROM
    grouped
ORDER BY
    g.Bucket

Is this solution faster than your first proposed solution? (despite the two ctes) @Dai — lemon, Jun 14 '22 at 00:29
@lemon The only way to find out is by running it and looking at the query execution plan. — Dai, Jun 14 '22 at 00:34

score 0 · Answer 2 · answered Jun 13 '22 at 23:48

0

SELECT
  CASE 
    WHEN num BETWEEN 1 AND 5 THEN '01-05'
    WHEN num BETWEEN 6 AND 10 THEN '06-10'
    WHEN num BETWEEN 11 AND 20 THEN '11-20'
    WHEN num > 20 THEN '20+'
END AS bucket,
COUNT(1)
FROM data
GROUP BY 1

answered Jun 13 '22 at 23:48

topsail

2,186
3
17
17

ordering by alpha numeric string in SQL

2 Answers2