2

I have table with similar case number for different products. Hence I want to find the the case number has been added, removed and existing according to the date. I have a data something like below

Date        CapVersion CaseNumber 
----------------------------------
2020-12-02  CAP A      1002 
2020-12-02  CAP A      1003 
2020-12-02  CAP A      1004 
2020-12-11  CAP B      1002 
2020-12-11  CAP B      1003 
2020-12-14  CAP C      1003
2020-12-14  CAP C      1004 
2020-12-14  CAP C      1005 
2020-12-15  CAP D      1005 
2020-12-15  CAP D      1007

Output I want

Date    CapVersion  CaseNumberAdded CaseNumberRemoved   CaseNumberExisting
--------------------------------------------------------------------------
2020-12-02  CAP A   0               0                   3
2020-12-11  CAP B   0               1                   2
2020-12-14  CAP C   2               1                   1
2020-12-14  CAP D   1               2                   1

Output I get right now:

Date    CapVersion  CaseNumberAdded CaseNumberRemoved   CaseNumberExisting
--------------------------------------------------------------------------
2020-12-02  CAP A   0               0                   3
2020-12-11  CAP B   0               1                   2
2020-12-14  CAP C   1               1                   2
2020-12-15  CAP D   2               3                   0

So the code should take data from 2020-12-02 compare with data from 2020-12-11 and then it needs to compare data from 2020-12-11 with data from 2020-12-14 but what my current code is doing it compare data from 2020-12-02 with data from 2020-12-11 and then again it compare 2020-12-02 with 2020-12-14.

I will provide a fiddle and a code below.

Open the fiddle link you will get full access to my code https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=426aaaf1b851462d975909e56a08d4be

--this is the my error code
WITH BaseDate AS 
(
     SELECT MIN(DataTable.Date) BaseDate
     FROM DataTable 
),
Dates AS 
(
     SELECT DISTINCT DataTable.Date, DataTable.CapVersion
     FROM DataTable 
),
BaseData AS 
(
     SELECT Dates.Date, DataTable.CapVersion, DataTable.CaseNumber
     FROM DataTable
     CROSS JOIN Dates
     JOIN BaseDate ON DataTable.Date = BaseDate.BaseDate 
),
Detailed AS 
(
     SELECT 
         COALESCE(DataTable.Date, BaseData.Date) Date, 
         COALESCE(DataTable.CaseNumber, BaseData.CaseNumber) CaseNumber,
         CASE 
            WHEN DataTable.CaseNumber IS NOT NULL AND BaseData.CaseNumber IS NULL
               THEN 1
               ELSE 0 
         END CaseNumberAdded,
         CASE 
            WHEN DataTable.CaseNumber IS NULL AND BaseData.CaseNumber IS NOT NULL
               THEN 1
               ELSE 0 
         END CaseNumberRemoved,
         CASE 
            WHEN DataTable.CaseNumber IS NOT NULL AND BaseData.CaseNumber IS NOT NULL
               THEN 1
               ELSE 0 
         END CaseNumberExisting
     FROM 
         BaseData
     FULL JOIN 
         DataTable ON DataTable.CaseNumber = BaseData.CaseNumber
                   AND DataTable.Date = BaseData.Date 
)
SELECT 
    Detailed.Date,
    Dates.CapVersion,
    SUM(Detailed.CaseNumberAdded) CaseNumberAdded,
    SUM(Detailed.CaseNumberRemoved) CaseNumberRemoved,
    SUM(Detailed.CaseNumberExisiting) CaseNumberExisting
FROM 
    Detailed
JOIN 
    Dates ON Detailed.Date = Dates.Date
GROUP BY Detailed.Date, Dates.CapVersion;
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
David Lee
  • 61
  • 1
  • 6
  • 1
    Is CapVersion some sort of auto-incrementing id that corresponds with date? Meaning...does the column actually affect the grouping logic? – Chad Baldwin Dec 17 '20 at 07:11
  • @ChadBaldwin yes auto-incrementing id that corresponds with date – David Lee Dec 17 '20 at 07:15
  • Dear @ChadBaldwin you may come up with entire new solution based on my desire output it will be very much helpful to me – David Lee Dec 17 '20 at 07:37
  • I think for `2020-12-02` all three entries should be considered as added because nothing exists before this date, so all entries on this date are new, right? – Giorgos Betsos Dec 17 '20 at 08:15
  • @GiorgosBetsos no I want the take the earliest data as the existing data if you provide me a solution based on that I will be a help for me – David Lee Dec 17 '20 at 08:46

1 Answers1

3

I think this should do it:

;WITH DataTableBase AS 
(
    SELECT Date, CapVersion, ROW_NUMBER() OVER (ORDER BY Date) AS seq
    FROM DataTable
    GROUP BY Date, CapVersion
), DataTableSections AS 
( 
    SELECT Date, CapVersion, CaseNumber, DENSE_RANK() OVER (ORDER BY Date) AS seq   
    FROM DataTable  
), DataTableCombined AS
(
    SELECT dt1.seq, dt2.seq AS seqBefore
       , dt1.CaseNumber, dt2.CaseNumber AS CaseNumberBefore
       , dt1.Date, dt1.CapVersion, dt2.Date AS DateBefore
       , dt2.CapVersion AS CapVersionBefore
    FROM DataTableSections AS dt1
    FULL OUTER JOIN DataTableSections AS dt2 
       ON dt1.seq = dt2.seq + 1 AND dt1.CaseNumber = dt2.CaseNumber
)    
SELECT dt.seq
    , dt.Date
    , dt.CapVersion 
    , COUNT(CASE WHEN dt.seq != 1 AND CaseNumberBefore IS NULL THEN 1 END) AS CaseNumberAdded
    , COUNT(CASE WHEN CaseNumber IS NULL THEN 1 END) AS CaseNumberRemoved
    , COUNT(CASE WHEN (dt.seq = 1) OR (CaseNumber IS NOT NULL AND CaseNumberBefore IS NOT NULL) THEN 1 END) AS CaseNumberExisiting
FROM DataTableBase AS dt
INNER JOIN DataTableCombined AS dtc ON dt.seq = COALESCE(dtc.seq, dtc.seqBefore + 1)
GROUP BY dt.seq, dt.Date, dt.CapVersion

DB Fiddle Demo

Giorgos Betsos
  • 71,379
  • 9
  • 63
  • 98
  • [When should I use semicolons in SQL Server?](https://stackoverflow.com/questions/710683/when-should-i-use-semicolons-in-sql-server) – Luuk Dec 17 '20 at 12:02