2

Goal: For each "IDCONT", i need to get the "DAY_ID" where i have the last change/update on "STATE_ID".

Example:

with reftable as (
 select 1 as PROCESSID, 'A' as IDCONT, 'X' as STATEID, '10' AS DAY_ID union all
 select 2 as PROCESSID, 'A' as IDCONT, 'X' as STATEID, '11' AS DAY_ID union all
 select 3 as PROCESSID, 'A' as IDCONT, 'Y' as STATEID, '12' AS DAY_ID union all
 select 4 as PROCESSID, 'A' as IDCONT, 'Y' as STATEID, '13' AS DAY_ID union all

 select 1 as PROCESSID, 'B' as IDCONT, 'N' as STATEID, '14' AS DAY_ID union all
 select 2 as PROCESSID, 'B' as IDCONT, 'N' as STATEID, '15' AS DAY_ID union all
 select 3 as PROCESSID, 'B' as IDCONT, 'M' as STATEID, '16' AS DAY_ID union all

 select 1 as PROCESSID, 'C' as IDCONT, 'X' as STATEID, '11' AS DAY_ID union all
 select 2 as PROCESSID, 'C' as IDCONT, 'X' as STATEID, '18' AS DAY_ID union all
) ...

Expected result:

PROCESSID   IDCONT   STATID   DAYID
3           A        Y        12
2           B        N        15
1           C        X        11        

I solved the problem with this:

...
SELECT IDCONT, STATEID, MIN(DAY_ID)
FROM REFTABLE 
WHERE (IDCONT, STATEID) IN (
   SELECT IDCONT, FIRST_VALUE(STATEID) OVER PARTITION BY IDCONT ORDER BY PROCESSID DESC) AS STATEID
   FROM REFTABLE
)

But i want to do the same without the need to call the table a 2nd time.

Thx!

reiver
  • 117
  • 1
  • 8

3 Answers3

1

It would be simpler if you didn't need to return IDCONT whose STATEID didn't change (that would be a C). One REFTABLE trip might look like this; see if it does any good in reality.

SQL> with reftable as (
  2   select 1 as PROCESSID, 'A' as IDCONT, 'X' as STATEID, '10' AS DAY_ID from dual union all
  3   select 2 as PROCESSID, 'A' as IDCONT, 'X' as STATEID, '11' AS DAY_ID from dual union all
  4   select 3 as PROCESSID, 'A' as IDCONT, 'Y' as STATEID, '12' AS DAY_ID from dual union all
  5   select 4 as PROCESSID, 'A' as IDCONT, 'Y' as STATEID, '13' AS DAY_ID from dual union all
  6   --
  7   select 1 as PROCESSID, 'B' as IDCONT, 'N' as STATEID, '14' AS DAY_ID from dual union all
  8   select 2 as PROCESSID, 'B' as IDCONT, 'N' as STATEID, '15' AS DAY_ID from dual union all
  9   select 3 as PROCESSID, 'B' as IDCONT, 'M' as STATEID, '16' AS DAY_ID from dual union all
 10   --
 11   select 1 as PROCESSID, 'C' as IDCONT, 'X' as STATEID, '11' AS DAY_ID from dual union all
 12   select 2 as PROCESSID, 'C' as IDCONT, 'X' as STATEID, '18' AS DAY_ID from dual
 13  ),
 14  inter as
 15    (select processid, idcont, stateid, day_id,
 16            case when nvl(lag(stateid) over
 17                            (partition by idcont order by processid  ), '?') <> stateid then
 18                      row_number() over (partition by idcont order by processid )
 19            end grp
 20     from reftable
 21    )
 22  select processid, idcont, stateid, day_id
 23  from inter i
 24  where grp = (select max(i1.grp)
 25              from inter i1
 26              where i1.idcont = i.idcont)
 27  order by idcont, processid;

 PROCESSID IDCONT     STATEID    DAY_ID
---------- ---------- ---------- ----------
         3 A          Y          12
         3 B          M          16
         1 C          X          11

SQL>
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
1

Here is one method:

select r.*
from (select r.*,
             lag(stateid) over (partition by idcont order by day_id) as prev_stateid,
             first_value(stateid) over (partition by idcont order by day_id desc) as last_stateid
      from reftable r
     ) r
where stateid = last_stateid and (prev_stateid is null or prev_stateid <> stateid);

However, this does not handle the case where the state changes back to a previous state. That logic can be added in, if necessary.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Using "FIRST_VALUE" does give you the change, but can you rely on there only being a single change in the table? Won't other changes invalidate this and result in bad data?

I used the LAG function instead but it doesn't return IDCONT C because it has not had a change.

Using a CTE to grab the data and then a query to filter might be faster (since you can't put LAG or FIRST_VALUE in the where clause). It would prevent another trip to the database.

CREATE TABLE REFTABLE
    ([PROCESSID] int, [IDCONT] varchar(1), [STATEID] varchar(1), [DAY_ID] int)
;

INSERT INTO REFTABLE
    ([PROCESSID], [IDCONT], [STATEID], [DAY_ID])
VALUES
    (1, 'A', 'X', 10),
    (2, 'A', 'X', 11),
    (3, 'A', 'Y', 12),
    (4, 'A', 'Y', 13),
    (1, 'B', 'N', 14),
    (2, 'B', 'N', 15),
    (3, 'B', 'M', 16),
    (1, 'C', 'X', 11),
    (2, 'C', 'X', 18)
;

with chgfound as (SELECT TOP 100 PERCENT PROCESSID, IDCONT, STATEID, DAY_ID, LAG(STATEID) OVER(PARTITION BY IDCONT ORDER BY IDCONT, PROCESSID) as LastState
from REFTABLE
order by IDCONT, PROCESSID
)
select * from chgfound where STATEID !=LastState

http://www.sqlfiddle.com/#!18/086134

Also just noticed you have the Oracle tag. I did this in SQL Server but it has to be really close to the same.

KingOfAllTrades
  • 398
  • 1
  • 11