If you use the accepted answer as written now (1/6/2023), you get correct results with the OP dataset, but I think you can get wrong results with other data.
CONFIRMED: ACCEPTED ANSWER HAS A MISTAKE (as of 1/6/2023)
I explain the potential for wrong results in my comments on the accepted answer.
In this db<>fiddle, I demonstrate the wrong results. I use a slightly modified form of accepted answer (my syntax works in SQL Server and PostgreSQL). I use a slightly modified form of the OP's data (I change two rows). I demonstrate how the accepted answer can be changed slightly, to produce correct results.
The accepted answer is clever but needs a small change to produce correct results (as demonstrated in the above db<>fiddle and described here:
- Instead of doing this as seen in the accepted answer
COUNT(DISTINCT AptGroup)...
- You should do this
COUNT(DISTINCT CONCAT(AptGroup, '_', AptNumber))...
DDL:
SELECT * INTO crewWork FROM (VALUES
-- data from question, with a couple changes to demonstrate problems with the accepted answer
-- https://stackoverflow.com/q/8666295/1175496
--FloorNumber AptNumber WorkType simTime
(1, 1, 12, 10 ),
-- (1, 1, 12, 25 ), -- original
(2, 1, 12, 25 ), -- new, changing FloorNumber 1->2->1
(1, 1, 13, 35 ),
(1, 1, 13, 47 ),
(1, 2, 12, 52 ),
(1, 2, 12, 59 ),
(1, 2, 13, 68 ),
(1, 1, 14, 75 ),
(1, 4, 12, 79 ),
-- (1, 4, 12, 89 ), -- original
(1, 1, 12, 89 ), -- new , changing AptNumber 4->1->4)
(1, 4, 13, 92 ),
(1, 4, 14, 105 ),
(1, 3, 12, 115 ),
...
DML:
;
WITH groupedWithConcats as (SELECT
*,
CONCAT(AptGroup,'_', AptNumber) as AptCombo,
CONCAT(FloorGroup,'_',FloorNumber) as FloorCombo
-- SQL SERVER doesnt have TEMPORARY keyword; Postgres doesn't understand # for temp tables
-- INTO TEMPORARY groupedWithConcats
FROM
(
SELECT
-- the columns shown in Andriy's answer:
-- https://stackoverflow.com/a/8667477/1175496
ROW_NUMBER() OVER ( ORDER BY simTime) as RN,
-- AptNumber
AptNumber,
ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) as RN_Apt,
ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) as AptGroup,
-- FloorNumber
FloorNumber,
ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime) as RN_Floor,
ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime) as FloorGroup
FROM crewWork
) grouped
)
-- if you want to see how the groupings work:
-- SELECT * FROM groupedWithConcats
-- otherwise just run this query to see the counts of "changes":
SELECT
COUNT(DISTINCT AptCombo)-1 as CountAptChangesWithConcat_Correct,
COUNT(DISTINCT AptGroup)-1 as CountAptChangesWithoutConcat_Wrong,
COUNT(DISTINCT FloorCombo)-1 as CountFloorChangesWithConcat_Correct,
COUNT(DISTINCT FloorGroup)-1 as CountFloorChangesWithoutConcat_Wrong
FROM groupedWithConcats;
ALTERNATIVE ANSWER
The accepted-answer may eventually get updated to remove the mistake. If that happens I can remove my warning but I still want leave you with this alternative way to produce the answer.
My approach goes like this: "check the previous row, if the value is different in previous row vs current row, then there is a change". SQL doesn't have idea or row order functions per se (at least not like in Excel for example; )
Instead, SQL has window functions. With SQL's window functions, you can use the window function RANK
plus a self-JOIN
technique as seen here to combine current row values and previous row values so you can compare them. Here is a db<>fiddle showing my approach, which I pasted below.
The intermediate table, showing the columns which has a value 1 if there is a change, 0 otherwise (i.e. FloorChange
, AptChange
), is shown at the bottom of the post...
DDL:
...same as above...
DML:
;
WITH rowNumbered AS (
SELECT
*,
ROW_NUMBER() OVER (
ORDER BY simTime) as RN
FROM crewWork
)
,joinedOnItself AS (
SELECT
rowNumbered.*,
rowNumberedRowShift.FloorNumber as FloorShift,
rowNumberedRowShift.AptNumber as AptShift,
CASE WHEN rowNumbered.FloorNumber <> rowNumberedRowShift.FloorNumber THEN 1 ELSE 0 END as FloorChange,
CASE WHEN rowNumbered.AptNumber <> rowNumberedRowShift.AptNumber THEN 1 ELSE 0 END as AptChange
FROM rowNumbered
LEFT OUTER JOIN rowNumbered as rowNumberedRowShift
ON rowNumbered.RN = (rowNumberedRowShift.RN+1)
)
-- if you want to see:
-- SELECT * FROM joinedOnItself;
SELECT
SUM(FloorChange) as FloorChanges,
SUM(AptChange) as AptChanges
FROM joinedOnItself;
Below see the first few rows of the intermediate table (joinedOnItself
). This shows how my approach works. Note the last two columns, which have a value of 1 when there is a change in FloorNumber
compared to FloorShift
(noted in FloorChange
), or a change in AptNumber
compared to AptShift
(noted in AptChange
).
floornumber |
aptnumber |
worktype |
simtime |
rn |
floorshift |
aptshift |
floorchange |
aptchange |
1 |
1 |
12 |
10 |
1 |
|
|
0 |
0 |
2 |
1 |
12 |
25 |
2 |
1 |
1 |
1 |
0 |
1 |
1 |
13 |
35 |
3 |
2 |
1 |
1 |
0 |
1 |
1 |
13 |
47 |
4 |
1 |
1 |
0 |
0 |
1 |
2 |
12 |
52 |
5 |
1 |
1 |
0 |
1 |
1 |
2 |
12 |
59 |
6 |
1 |
2 |
0 |
0 |
1 |
2 |
13 |
68 |
7 |
1 |
2 |
0 |
0 |
Note instead of using the window function RANK
and JOIN
, you could use the window function LAG
to compare values in the current row to the previous row directly (no need to JOIN
). I don't have that solution here, but it is described in the Wikipedia article example:
Window functions allow access to data in the records right before and after the current record.