How to delete duplicates data that is in between two common value?

Question

How can I delete duplicate data based on the common value (Start and End)

(Time is unique key)

My table is:

Time	Data
10:24:11	Start
10:24:12	Result
10:24:13	Result
10:24:14	End
10:24:15	Start
10:24:16	Result
10:24:17	End

I want to get Data: Result in between Start and End that is with the MAX(TIME) when duplication does occur. as such

The result that I want:

Time	Data
10:24:11	Start
10:24:13	Result
10:24:14	End
10:24:15	Start
10:24:16	Result
10:24:17	End

I have tried rearranging the data, but couldn't seems to get the result that I want, Could someone give their advice on this case?

Update

I ended up not using either of the the approach suggested by @fredt and @airliquide as my version of HSQLDB doesn't support the function.

so what I did was, adding sequence and making Start = 1, Result = 2, and End = 3.

Sequence	Time	Data	Indicator
1	10:24:11	Start	1
2	10:24:12	Result	2
3	10:24:13	Result	2
4	10:24:14	End	3
5	10:24:15	Start	1
6	10:24:16	Result	2
7	10:24:17	End	3

Thereon, I make use of the indicator and sequence to get only latest Result. Such that if previous row is 2 (which is result), remove it.

The guide that I follow: From: Is there a way to access the "previous row" value in a SELECT statement?

select t1.value - t2.value from table t1, table t2 
where t1.primaryKey = t2.primaryKey - 1

score 0 · Answer 1 · answered Dec 20 '21 at 15:03

Hi a first approach will be to use a lead function as folow

 select hour,status from (select *,lead(status,1) over ( order by hour) as lead
from newtable)compare 
where compare.lead <> status
OR lead is null

Give me what's expected using a postgres engine.

fredt · Answer 2 · 2021-12-20T18:11:29.617

You can do this sort of thing with SQL procedures.

-- create the table with only two columns
CREATE TABLE actions (attime TIME UNIQUE, data VARCHAR(10));

-- drop the procedure if it exists
DROP PROCEDURE del_duplicates IF EXISTS;

create procedure del_duplicates()  MODIFIES SQL DATA begin atomic
DECLARE last_time time(0) default null;
 for_loop:
 -- loop over the rows in order
 FOR SELECT * FROM actions ORDER BY attime DO

  -- each time 'Start' is found, clear the last_time variable
  IF data = 'Start' THEN
   SET last_time = NULL; 
   ITERATE for_loop;
  END IF;

  -- each time 'Result' is found, delete the row with previous time
  -- if last_time is null, no row is actually deleted 
  IF data = 'Result' THEN
   DELETE FROM actions WHERE attime = last_time;
   -- then store the latest time
   SET last_time = attime; 
   ITERATE for_loop;
  END IF;
        
END FOR;        
END

Your data must all belong to a single day, otherwise there will be strange overlaps that cannot be distinguished. It is better to use TIMESTAMP instead of TIME.

How to delete duplicates data that is in between two common value?

2 Answers2