1

For a database table looking something like this:

id  | year | stint | sv
----+------+-------+---
mk1 | 2001 | 1     | 30
mk1 | 2001 | 2     | 20
ml0 | 1999 | 1     | 43
ml0 | 2000 | 1     | 44
hj2 | 1993 | 1     | 70

I want to get the following output:

 count
-------
     3

with the conditions being count the number of ids that have a sv > 40 for a single year greater than 1994. If there is more than one stint for the same year, add the sv points and see if > 40.

This is what I have written so far but it is obviously not right:

SELECT COUNT(DISTINCT id),
SUM(sv) as SV
FROM public.pitching
WHERE (year > 1994 AND sv >40);

I know the syntax is completely wrong and some of the conditions' information is missing but I'm not familiar enough with SQL and don't know how to properly do the summing of two rows in the same table with a condition (maybe with a subquery?). Any help would be appreciated! (using postgres)

Wintress
  • 67
  • 2
  • 9
  • Since one of the ids has two different years (1999 and 2000 for ml0) – Wintress Mar 03 '19 at 06:50
  • This would need more details on what the role of `stint` is. Could you provide more details with intermediate results? What is the key of the input table? Can there be duplicate (id, year, stint) records? – trincot Mar 03 '19 at 06:56
  • @trincot This is taken from baseball. Stint is when a player has played again during the same season in maybe a different team. So if Stint is 1 and sv (saves) is 30 and Stint 2 has 20 sv, then the player's total saves becomes 50 for that year. I'm trying to count how many players have more than 40 saves (total) per year. – Wintress Mar 03 '19 at 07:02

2 Answers2

1

You could use a nested query to get the aggregations, and wrap that for getting the count. Note that the condition on the sum must be in a having clause:

SELECT COUNT(id)
FROM (
    SELECT   id,
             year,
             SUM(sv) as SV
    FROM     public.pitching
    WHERE    year > 1994
    GROUP BY id, 
             year
    HAVING   SUM(sv) > 40 ) years

If an id should only count once even it fulfils the condition in more than one year, then do COUNT(distinct id) instead of COUNT(id)

trincot
  • 317,000
  • 35
  • 244
  • 286
  • Thank you! This works as needed (without the distinct). I just have a question regarding the last line - what does the years statement do? – Wintress Mar 03 '19 at 07:07
  • It gives an alias name to the subquery. See [this](https://stackoverflow.com/a/14767216/5459839) – trincot Mar 03 '19 at 07:10
1

You can try like following using sum and partition by year.

select count( distinct year) from
(
  select year, sum(sv) over (partition by year) s
  from public.pitching
  where year > 1994
) t where s>40

Online Demo

PSK
  • 17,547
  • 5
  • 32
  • 43