
I am trying to figure out how to count every product at every date such that the count is incremental across all products.

This is a dummy table for understanding; I have millions of records with thousands of different products.

I am unable to query, for each ID and product at every date, the count in an incremental fashion, along with the miles as of the date provided.

CREATE TABLE Dummy_tab (
    empid int,
    date1_start date,
    name_emp varchar(255),
    product varchar(255),
    miles int  -- miles is numeric in the desired output, so int rather than varchar
);
 

INSERT INTO Dummy_tab VALUES
  (1, '2018-08-27', 'Eric', 'a',10),
  (1, '2018-08-28', 'Eric','b',10),
  (1, '2018-08-28', 'Eric','a',20),
  (2, '2020-01-08', 'Jack','d',10),
  (2, '2020-02-08', 'Jack','b',20),
  (2, '2020-12-28', 'Jack','b',20),
  (2, '2020-12-28', 'Jack','d',20),
  (2, '2021-10-28', 'Jack','c',20),
  (2, '2022-12-28', 'Jack','d',20),
  (3, '2018-12-31', 'Jane','',10),
  (3, '2018-12-31', 'Jane','',15);


My desired output is this:

Id  Date        a  b  c  d  empty  miles
1   2018-08-27  1  0  0  0  0      10
1   2018-08-28  2  1  0  0  0      20
2   2020-01-08  0  0  0  1  0      10
2   2020-02-08  0  1  0  1  0      20
2   2020-12-28  0  2  0  2  0      20
2   2021-10-28  0  2  1  2  0      20
2   2022-12-28  0  2  1  3  0      20
3   2018-12-31  0  0  0  0  1      10
3   2019-12-31  0  0  0  0  2      15

For example:

Eric has 3 entries for ID = 1: product a on 2018-08-27, product b on 2018-08-28, and product a on 2018-08-28.

So the 1st entry gives a = 1 for ID = 1; the 2nd entry is the sum of the previous and current counts, so now a = 2 for ID = 1, and b = 1 as there was no earlier entry for b.

Miles needs to be the maximum miles for that date, taking past dates into account.
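
In window-function terms, this is a running count per ID. For a single product, a minimal sketch of the behaviour I'm after (against the sample table above) would be:

-- Running count of product 'a' per empid, ordered by date.
-- Sketch only: it returns one row per source row, so duplicate
-- dates would still need to be collapsed to one row per date.
SELECT empid,
       date1_start,
       SUM(CASE WHEN product = 'a' THEN 1 ELSE 0 END)
           OVER (PARTITION BY empid ORDER BY date1_start) AS a_so_far
FROM Dummy_tab;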

    Does this answer your question? [How to get cumulative sum](https://stackoverflow.com/questions/2120544/how-to-get-cumulative-sum) – Thom A Aug 12 '22 at 14:44
  • I am not sure if this will work, as I need every date for every product for every ID – Dexter1611 Aug 12 '22 at 15:48

1 Answer


You need to first (conditionally) aggregate your values here, and then you can do a cumulative SUM:

WITH Aggregates AS(
    -- One row per (empid, date): conditional counts of each product,
    -- plus the maximum miles recorded on that date.
    SELECT empid AS Id,
           date1_start AS [Date],
           COUNT(CASE product WHEN 'a' THEN 1 END) AS a,
           COUNT(CASE product WHEN 'b' THEN 1 END) AS b,
           COUNT(CASE product WHEN 'c' THEN 1 END) AS c,
           COUNT(CASE product WHEN 'd' THEN 1 END) AS d,
           COUNT(CASE product WHEN '' THEN 1 END) AS empty,
           MAX(miles) AS miles
    FROM dbo.Dummy_tab
    GROUP BY empid, date1_start)
SELECT Id,
       [Date],
       -- The cumulative SUM per Id, ordered by date, turns the
       -- per-date counts into running totals.
       SUM(a) OVER (PARTITION BY Id ORDER BY [Date]) AS a,
       SUM(b) OVER (PARTITION BY Id ORDER BY [Date]) AS b,
       SUM(c) OVER (PARTITION BY Id ORDER BY [Date]) AS c,
       SUM(d) OVER (PARTITION BY Id ORDER BY [Date]) AS d,
       SUM(empty) OVER (PARTITION BY Id ORDER BY [Date]) AS empty,
       miles
FROM Aggregates 
ORDER BY Id,
         [Date];
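
Against the sample data, this should return the following (note the single row for Id 3, which is what the comments below discuss):

Id  Date        a  b  c  d  empty  miles
1   2018-08-27  1  0  0  0  0      10
1   2018-08-28  2  1  0  0  0      20
2   2020-01-08  0  0  0  1  0      10
2   2020-02-08  0  1  0  1  0      20
2   2020-12-28  0  2  0  2  0      20
2   2021-10-28  0  2  1  2  0      20
2   2022-12-28  0  2  1  3  0      20
3   2018-12-31  0  0  0  0  2      15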
  • I have millions of product IDs – Dexter1611 Aug 12 '22 at 15:10
  • I don't follow your point, @Dexter1611. If you have more products, add more columns, but you *can't* have 1 million columns, so just add the ones you need. – Thom A Aug 12 '22 at 15:11
  • The output for ID 3 is not matching; it needs two rows but I'm getting only 1 row – Dexter1611 Aug 12 '22 at 15:44
  • Every date for every product for every ID needs to be present in the output – Dexter1611 Aug 12 '22 at 15:45
  • As I said, you **can't** have millions of columns, @Dexter1611. You can use a dynamic pivot to pivot *some* of the values, but if you are going to define what columns you need, then you are best off just using actual static SQL. Just add the columns you need for the products you want (as there is no way you want **millions** of columns). – Thom A Aug 12 '22 at 15:54
  • Honestly, if you *do* have millions of products, don't pivot your data. – Thom A Aug 12 '22 at 15:55
  • Let's stick to a few columns, but the output for ID 3 has only 1 row; what I need is 2 rows – Dexter1611 Aug 12 '22 at 16:10
  • Your data and your expected output don't match, @Dexter1611; that's not the solution's fault. You have no rows for `2019-12-31` for `Jane` in the data, and expect it in the output; it can't output a row that doesn't exist. – Thom A Aug 12 '22 at 16:13
  • @Larnu you are correct – Dexter1611 Aug 12 '22 at 16:16
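
For completeness, the dynamic pivot Thom A alludes to could be sketched along these lines, assuming SQL Server 2017+ (for STRING_AGG) and a modest number of distinct products; the empty product is aliased as [empty] to match the output above:

-- Sketch of a dynamic pivot: generate one COUNT(CASE ...) column per
-- distinct product, then apply the same cumulative SUM pattern.
DECLARE @agg nvarchar(max), @cum nvarchar(max), @sql nvarchar(max);

SELECT @agg = STRING_AGG(CONVERT(nvarchar(max),
                  'COUNT(CASE product WHEN ' + QUOTENAME(product, '''')
                  + ' THEN 1 END) AS ' + col), ', ')
              WITHIN GROUP (ORDER BY product),
       @cum = STRING_AGG(CONVERT(nvarchar(max),
                  'SUM(' + col + ') OVER (PARTITION BY Id ORDER BY [Date]) AS '
                  + col), ', ')
              WITHIN GROUP (ORDER BY product)
FROM (SELECT DISTINCT product,
             -- the empty product needs a usable column name
             QUOTENAME(CASE WHEN product = '' THEN 'empty' ELSE product END) AS col
      FROM dbo.Dummy_tab) AS p;

SET @sql = N'WITH Aggregates AS(
    SELECT empid AS Id, date1_start AS [Date], ' + @agg + N',
           MAX(miles) AS miles
    FROM dbo.Dummy_tab
    GROUP BY empid, date1_start)
SELECT Id, [Date], ' + @cum + N', miles
FROM Aggregates
ORDER BY Id, [Date];';

EXEC sys.sp_executesql @sql;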