1

Taking it a step further than in my previous question: I have a table below with sample data.

datetimestamp message server system
2022-07-13 08:59:09 Normal Server 1 System 1
2022-07-13 08:59:10 Normal Server 4 System 2
2022-07-13 08:59:11 Normal Server 3 System 3
2022-07-13 08:59:12 Warning Server 35 System 27
2022-07-13 08:59:13 Warning Server 5 System 5
2022-07-13 08:59:14 Warning Server 9 System 6
2022-07-13 08:59:15 Warning Server 8 System 7
2022-07-13 08:59:16 Error Server 12 System 8
2022-07-13 08:59:17 Error Server 15 System 9
2022-07-13 08:59:18 Warning Server 29 System 10
2022-07-13 08:59:19 Warning Server 22 System 11
2022-07-13 08:59:20 Warning Server 13 System 12
2022-07-13 08:59:21 Normal Server 16 System 13
2022-07-13 08:59:22 Normal Server 19 System 14
2022-07-13 08:59:23 Normal Server 21 System 15
2022-07-13 08:59:24 Warning Server 24 System 16
2022-07-13 08:59:25 Warning Server 27 System 17
2022-07-13 08:59:26 Warning Server 25 System 18
2022-07-13 08:59:27 Error Server 30 System 23
2022-07-13 08:59:28 Error Server 31 System 20

I need to write a Postgres query which will give me output like:

Start_Datetime End_Datetime Server_Start Server_End System_Start System_End
2022-07-13 08:59:12 2022-07-13 08:59:15 Server 35 Server 8 System 27 System 7
2022-07-13 08:59:24 2022-07-13 08:59:26 Server 24 Server 25 System 16 System 18

If there is any 'Error' message, I need to take the start time of 'Warning' and end time of 'Warning' - and corresponding server and system! If there is no 'Error' message after 'Warning' ignore it e.g. after the warning at 2022-07-13 08:59:20 there is no error, so the query should ignore that range.

Setup table queries:

CREATE TABLE test_data (
  id integer PRIMARY KEY
, message varchar(10)
, datetimestamp timestamp NOT NULL
, server varchar(10)
, system varchar(10)
);

INSERT INTO test_data VALUES
  (09, 'Normal' , '2022-07-13 08:59:09', 'Server 1' , 'System 1')
, (10, 'Normal' , '2022-07-13 08:59:10', 'Server 4' , 'System 2')
, (11, 'Normal' , '2022-07-13 08:59:11', 'Server 3' , 'System 3')
, (12, 'Warning', '2022-07-13 08:59:12', 'Server 35', 'System 27')
, (13, 'Warning', '2022-07-13 08:59:13', 'Server 5' , 'System 5')
, (14, 'Warning', '2022-07-13 08:59:14', 'Server 9' , 'System 6')
, (15, 'Warning', '2022-07-13 08:59:15', 'Server 8' , 'System 7')
, (16, 'Error'  , '2022-07-13 08:59:16', 'Server 12', 'System 8')
, (17, 'Error'  , '2022-07-13 08:59:17', 'Server 15', 'System 9')
, (18, 'Warning', '2022-07-13 08:59:18', 'Server 29', 'System 10')
, (19, 'Warning', '2022-07-13 08:59:19', 'Server 22', 'System 11')
, (20, 'Warning', '2022-07-13 08:59:20', 'Server 13', 'System 12')
, (21, 'Normal' , '2022-07-13 08:59:21', 'Server 16', 'System 13')
, (22, 'Normal' , '2022-07-13 08:59:22', 'Server 19', 'System 14')
, (23, 'Normal' , '2022-07-13 08:59:23', 'Server 21', 'System 15')
, (24, 'Warning', '2022-07-13 08:59:24', 'Server 24', 'System 16')
, (25, 'Warning', '2022-07-13 08:59:25', 'Server 27', 'System 17')
, (26, 'Warning', '2022-07-13 08:59:26', 'Server 25', 'System 18')
, (27, 'Error'  , '2022-07-13 08:59:27', 'Server 30', 'System 23')
, (28, 'Error'  , '2022-07-13 08:59:28', 'Server 31', 'System 20')
;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
RKIDEV
  • 119
  • 7
  • Is an error always rized after warning or it could be rized after normal state? – ahmed Jul 30 '22 at 18:04
  • Hi @Ahmed ‘Error’ comes always after Warning but sometimes the server can go to ‘Normal’ after ‘Warning’ state. But yeah ‘Error’ comes after ‘Warning’ only. – RKIDEV Jul 31 '22 at 01:09
  • Hi @Parfait this is almost same as my another question - https://stackoverflow.com/questions/73150635/get-start-and-end-date-time-based-on-based-on-sequence-of-rows?noredirect=1#comment129196189_73150635 . Only difference is that here I have 2 more columns. – RKIDEV Jul 31 '22 at 01:16

3 Answers3

1

Elaborating on my answer to your previous question, this does the job:

WITH cte AS (
   SELECT grp, datetimestamp, message, server, system
        , bool_or(message = 'Error' AND last_msg = 'Warning') OVER (PARTITION BY grp) AS report  -- qualifies for report
   FROM  (
      SELECT *
           , count(*) FILTER (WHERE message = 'Warning' AND last_msg <> 'Warning') OVER (ORDER BY datetimestamp) AS grp
      FROM  (
         SELECT datetimestamp, message, server, system
              , lag(message) OVER (ORDER BY datetimestamp) AS last_msg
         FROM   test_data
         ) sub1
      WHERE  message IN ('Warning', 'Error')  -- trim noise early
      ) sub2
   )
SELECT a.datetimestamp AS start_time, z.datetimestamp AS end_time
     , a.server AS start_server, z.server AS end_server
     , a.system AS start_system, z.system AS end_system
FROM  (
   SELECT DISTINCT ON (grp) *
   FROM   cte
   WHERE  report
   ORDER  BY grp, datetimestamp
   ) a
JOIN  (
   SELECT DISTINCT ON (grp) *
   FROM   cte
   WHERE  report
   AND    message = 'Warning'  -- exclude trailing errors
   ORDER  BY grp, datetimestamp DESC
   ) z USING (grp);

db<>fiddle here

Produces your desired result exactly.

This relies on unique timestamps. If there can be duplicates, work with id additionally (or break ties some other way).

Move the base query from the previous answer to a CTE, and add the tag report to mark qualifying rows - we are going to reference it multiple times.

In the outer query, use two subqueries with DISTINCT ON to retrieve first and last row per group, then join ...

See:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

You may try the following:

With cte As 
(
  Select D.id,D.message,D.ld,D.lg,
  Coalesce(Lag(D.id) over (order by D.datetimestamp,D.id),D.id) lastWarningId,
  Coalesce(Lag(D.id, 2) over (order by D.datetimestamp,D.id),
           Lag(D.id) over (order by D.datetimestamp,D.id)) firstWarningId
  From 
  (
     Select id, message,datetimestamp, 
            Coalesce(Lead(message) Over (Order By datetimestamp,id) , 'last') as ld,
            Coalesce(lag(message) Over (Order By datetimestamp,id) , 'first') as lg
     From test_data) D
     Where D.message<> D.ld or D.message <> D.lg
 ),
cte2 As
(
  Select id,message,ld,lg, lastWarningId, 
         Case When (Select Count(*) From test_data T Where T.id Between firstWarningId and lastWarningId and T.message='Warning')=1
         Then
              lastWarningId 
         Else 
              firstWarningId 
         End as firstwarningid2
  From cte
 )
Select T.datetimestamp as Start_DateTime, T2.datetimestamp as End_DateTime, 
       T.server as Server_Start, T2.server as Server_End,
       T.system as System_Start, T2.system as System_End
  From cte2 C
  Join test_data T On T.id = C.firstwarningid2
  Join test_data T2 On T2.id = C.lastwarningid
  Where C.message='Error' And C.lg='Warning'

See a demo from uk<>fiddle.

The idea of this query is to find the first and last id for each group of messages, which is done in cet2. Where:

Lag(id) over (order by datetimestamp,id) is the lastid of each group.

Lag(id, 2) over (order by datetimestamp,id) is the first id of each group.

Then for each group of message='Error' find the first and last id of the nearest group of message='Warning', which done by Where C.message='Error' And C.lg='Warning'.

To get more details about how it works, try to select * from cet2 Where message='Error' And lg='Warning'.

ahmed
  • 9,071
  • 3
  • 9
  • 22
0

This is a kind of gaps and islands problem. First detect groups of messages and then find the lead / lag group kind. Take the combinations needed. I'm using arrays to collect data ordered by datetimestamp so first/last item are the items needed.

with mg as(
  select message, count(*) n
    , min(datetimestamp) start_time
    , max(datetimestamp) end_time
    , array_agg(server order by datetimestamp) asrv
    , array_agg(system order by datetimestamp) asys
    , lag(message, 1, 'Normal') over(order by min(datetimestamp)) prevmsg
    , lead(message,1, 'Error') over(order by min(datetimestamp)) nxtmsg
  from (
    select *, row_number() over(order by datetimestamp) - row_number() over(partition by message order by datetimestamp) grp
    from test_data
  ) t
  group by message, grp
)
select start_time, end_time
  , asrv[1] start_server, asrv[n] end_server
  , asys[1] start_system, asys[n] end_system
from mg
where mg.message ='Warning' and mg.prevmsg='Normal' and mg.nxtmsg='Error'

db_fiddle

Serg
  • 22,285
  • 5
  • 21
  • 48