0

This seemed like a very easy thing to do but I got stuck. I have a query like this:

select op.date, count(p.numberofoutstanding)
from people p
left join outstandingpunches op
on p.fullname = op.fullname
group by op.date

That outputs a table like this:

enter image description here

How can I sum over the dates so the sum for each row is equal to the sums up to that date? For example, the first column would be 27, the second would be 27 + 4, the third 27 + 4 + 11, etc.

I encountered this and this question, and I saw people are using OVER in their queries for this, but I'm confused by what do I have to partition. I tried partitioning by date but it's giving me incorrect results.

lukaabra
  • 83
  • 1
  • 10
  • 1
    You don't have to use partitioning if you don't need to, it is optional. Partition By will effectively scope the records into a windowed subset. – Chris Schaller Feb 24 '21 at 12:53

2 Answers2

1

You can use a cumulative sum. This looks like:

select op.date, count(*),
       sum(count(*)) over (order by op.date) as running_count
from people p join
     outstandingpunches op
     on p.fullname = op.fullname
group by op.date;

Note: I changed the join from a left join to an inner join. You are aggregating by a column in the second table. Your results have no examples of a NULL date column and that doesn't seem useful. Hence, it seems that rows are assumed to match.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • You need `ROWS UNBOUNDED PRECEDING` to ensure correct results, and for performance – Charlieface Feb 24 '21 at 13:22
  • @Charlieface . . . The default is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. The `ORDER BY` key is unique because of the `GROUP BY`, so your comment is simply wrong. – Gordon Linoff Feb 24 '21 at 13:57
  • Yes in this instance you are using a unique key, but I've seen so many examples where that is not the case that I would say *always* write it just out of principle. And the performance argument stands: as far as I am aware there is no logical in the compiler currently to optimize that case. – Charlieface Feb 24 '21 at 16:35
1

I believe you need to use sum and not count.

select o.date_c, 
       sum(sum(p.numberofoutstanding)) over (order by o.date_c)
from people p 
left join outstandingpunches o on p.fullname = o.fullname
group by o.date_c;

Here is a small demo:

DEMO

Have in mind that I have renamed your column date to date_c. I believe you should not use data types as column names.

VBoka
  • 8,995
  • 3
  • 16
  • 24
  • 1
    Hi @lukaabra, can you please tell me how is a sum of a count the correct answer ? Thanks! – VBoka Feb 24 '21 at 13:07
  • Hi. If I use sum of a sum I get much higher numbers than the actual case. My rows have the values: 71, 78, 99, etc., which is not correct. – lukaabra Feb 24 '21 at 13:11
  • 1
    And what about my demo, what is wrong with tat ? Can you correct it and send me the link after you correct it and run it ? Hvala. Here is the demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=e5cdfc9c8e0cead21b7fef346556a38b – VBoka Feb 24 '21 at 13:13
  • 1
    Here is the corrected fiddle: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=b1e19ec47901d2b179733c4727e279ab Using sum of sum gives out some wonky values – lukaabra Feb 24 '21 at 13:23
  • 1
    I understand now. Thanks! – VBoka Feb 24 '21 at 13:32