1

I have a table below with sample data.

Date&Time Message
13/07/2022 8:59:09 Normal
13/07/2022 8:59:10 Normal
13/07/2022 8:59:11 Normal
13/07/2022 8:59:12 Warning
13/07/2022 8:59:13 Warning
13/07/2022 8:59:14 Warning
13/07/2022 8:59:15 Warning
13/07/2022 8:59:16 Error
13/07/2022 8:59:17 Error
13/07/2022 8:59:18 Warning
13/07/2022 8:59:19 Warning
13/07/2022 8:59:20 Warning
13/07/2022 8:59:21 Normal
13/07/2022 8:59:22 Normal
13/07/2022 8:59:23 Normal
13/07/2022 8:59:24 Warning
13/07/2022 8:59:25 Warning
13/07/2022 8:59:26 Warning
13/07/2022 8:59:27 Error
13/07/2022 8:59:28 Error

I need to write a Postgres query which will give me output like:

Start Date&Time End Date&Time
13/07/2022 8:59:12 13/07/2022 8:59:15
13/07/2022 8:59:24 13/07/2022 8:59:26

The scenario is - if there is any 'Error' message, I need to take the start time of 'Warning' and end time of Warning, and if there is no 'Error' message after 'Warning' ignore it e.g. after 13/07/2022 8:59:20 Warning' there is no error, so query should ignore that range. Please help to find a query for this.

Setup table queries:

CREATE TABLE test_data (
  id integer PRIMARY KEY
, message VARCHAR(10)
, datetimestamp timestamp without time zone NOT NULL
);

SET datestyle = 'DMY';
INSERT INTO test_data VALUES
  (09, 'Normal' , '13/07/2022 8:59:09')
, (10, 'Normal' , '13/07/2022 8:59:10')
, (11, 'Normal' , '13/07/2022 8:59:11')
, (12, 'Warning', '13/07/2022 8:59:12')
, (13, 'Warning', '13/07/2022 8:59:13')
, (14, 'Warning', '13/07/2022 8:59:14')
, (15, 'Warning', '13/07/2022 8:59:15')
, (16, 'Error'  , '13/07/2022 8:59:16')
, (17, 'Error'  , '13/07/2022 8:59:17')
, (18, 'Warning', '13/07/2022 8:59:18')
, (19, 'Warning', '13/07/2022 8:59:19')
, (20, 'Warning', '13/07/2022 8:59:20')
, (21, 'Normal' , '13/07/2022 8:59:21')
, (22, 'Normal' , '13/07/2022 8:59:22')
, (23, 'Normal' , '13/07/2022 8:59:23')
, (24, 'Warning', '13/07/2022 8:59:24')
, (25, 'Warning', '13/07/2022 8:59:25')
, (26, 'Warning', '13/07/2022 8:59:26')
, (27, 'Error'  , '13/07/2022 8:59:27')
, (28, 'Error'  , '13/07/2022 8:59:28')
;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
RKIDEV
  • 119
  • 7
  • Aside: Use unambiguous ISO format wherever possible. See: https://stackoverflow.com/a/69451546/939860, https://stackoverflow.com/a/12426809/939860 – Erwin Brandstetter Jul 29 '22 at 10:48

2 Answers2

1

Please refer to the below script.

WITH cte AS
(
         SELECT   id,
                  message,
                  datetimestamp,
                  Extract( second FROM datetimestamp )::bigint - rank() over(partition BY message ORDER BY datetimestamp) AS gap
         FROM     (
                         SELECT id,
                                message,
                                datetimestamp
                         FROM   (
                                         SELECT   id,
                                                  message,
                                                  datetimestamp ,
                                                  lag(message) OVER (ORDER BY datetimestamp) ,
                                                  lead(message) OVER (ORDER BY datetimestamp) ,
                                                  lag(message,1) OVER (ORDER BY datetimestamp) IN ('Normal',
                                                                                                   'Warning') AS step1 ,
                                                  lead(message,1) OVER (ORDER BY datetimestamp) IN ('Warning',
                                                                                                    'Error') AS step2 ,
                                                  lag(message,2) OVER (ORDER BY datetimestamp) IN ('Normal',
                                                                                                   'Warning') AS step3 ,
                                                  lead(message,2) OVER (ORDER BY datetimestamp) IN ('Warning',
                                                                                                    'Error') AS step4
                                         FROM     test_data
                                         ORDER BY datetimestamp) sub1
                         WHERE  step1
                         AND    step2
                         AND    step3
                         AND    step4
                         AND    message = 'Warning' ) sub )
SELECT   gap,
         min(datetimestamp),
         max(datetimestamp)
FROM     cte
GROUP BY gap

There is one sample

Courser Xu
  • 142
  • 4
  • thank you for reply. Along with the start and end datetime I need to fetch two more column details(Posted separately - https://stackoverflow.com/questions/73176222/postgres-find-start-and-end-date-time-and-corresponding-column-value-based-on-co). Is there any way to achieve this ? – RKIDEV Jul 30 '22 at 14:05
1

This is a typical problem.

SELECT min(datetimestamp) FILTER (WHERE message = 'Warning') AS start_warning
     , max(datetimestamp) FILTER (WHERE message = 'Warning') AS end_warning
FROM  (
   SELECT *
        , count(*) FILTER (WHERE message = 'Warning' AND last_msg <> 'Warning')
                   OVER (ORDER BY datetimestamp) AS grp
   FROM  (
      SELECT datetimestamp, message
           , lag(message) OVER (ORDER BY datetimestamp) AS last_msg
      FROM   test_data
      ) sub1
   WHERE  message IN ('Warning', 'Error')
   ) sub2
GROUP  BY grp
HAVING bool_or(message = 'Error' AND last_msg = 'Warning');

db<>fiddle here

The query starts a new group whenever a 'Warning' is preceded by something else. And it only reports times when the 'Warning' block is followed by an 'Error'. You can have any number of different message types (except for NULL), these are the only conditions.

The expensive part is the window function. (In particular when using different window frames.) This query makes do with a single window function and a single sort order. So it should be as fast (and simple) as it gets.

Related:

About the aggregate FILTER:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • thank you for reply. Along with the start and end datetime I need to fetch two more column details(Posted separately - https://stackoverflow.com/questions/73176222/postgres-find-start-and-end-date-time-and-corresponding-column-value-based-on-co). Is there any way to achieve this ? – RKIDEV Jul 30 '22 at 14:06