0

Question:

Write a query that counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (everywhere else). Reference : https://mode.com/sql-tutorial/sql-case/

The relevant part of the table looks like this

name weight state
Giga 370 WY
Chad 350 OR

this is the correct solution:

SELECT CASE WHEN state IN ('CA', 'OR', 'WA') THEN 'West Coast'
            WHEN state = 'TX' THEN 'Texas'
            ELSE 'Other' END AS arbitrary_regional_designation,
        COUNT(*) AS players
  FROM benn.college_football_players
 WHERE weight >= 300
 GROUP BY 1

output:

| arbitrary_regional_designation | players  |
|:-------------------------------|:---------|
| Other                          | 1590   |
| West Coast                     | 186  |
| Texas                          | 208   | 

this makes sense to me, I am wondering why the below code doesn't give the right answer

SELECT CASE WHEN state IN ('CA','OR', 'WA') THEN 'West Coast'
            WHEN state = 'TX' THEN 'Texas'
            ELSE 'Other' END AS region,
      count(weight>=300) --what's different here?
FROM benn.college_football_players
group by region

output for the above query:

| region | count |
|:---- |:------:| 
| Other  | 20513    |
| West Coast  | 2944    |
| Texas  | 2841    |

would appreciate your insight.

Charlieface
  • 52,284
  • 6
  • 19
  • 43
AdR
  • 197
  • 1
  • 3
  • 10
  • What is the output with `count(case when weight>=300 then 1 else null end)`? I would guess that `weight>=300` evaluates to `True` or `False`, neither of which is `null`. – fen1x Mar 19 '23 at 04:44
  • Why have you tagged both SQL server and postgresql? They are completely different products – Dale K Mar 19 '23 at 04:53

1 Answers1

2

In the "correct solution" that you mentioned, the WHERE clause will filter the actual rows (i.e., remove them) so when you count using the WHERE clause, you're counting the remaining rows.

If you don't like the WHERE clause you could do something like:

SELECT CASE WHEN state IN ('CA','OR', 'WA') THEN 'West Coast'
            WHEN state = 'TX' THEN 'Texas'
            ELSE 'Other' END AS region,
       COUNT(*) FILTER (WHERE weight >= 300) AS relevant_weight_count
FROM benn.college_football_players
GROUP BY region

This should have the added benefit of counting when a region has zero players using the weight threshold (where as the WHERE clause wouldn't show the row altogether).

Putting the condition inside of the COUNT directly (without FITLER) will count the rows regarding of weight. You can find more information about this specific topic in other questions (e.g., here).

Eric Peña
  • 91
  • 2