0

How to add multiple records skipping duplicate values but to keep tracking value starting timestamp and last timestamp of value. Explanation why: I have sensor reading values every minute. If value are the same there is no need to store same again. If I update last row when is same I lost timestamp od data change. But If I skip update if same value then I don't know if last data is real. If sensor communication is broken in my approach is clear when last data is sampled. Is that possible at all?

Here is example what I mean: Create table with:

CREATE TABLE tbase (
    ID int UNIQUE AUTO_INCREMENT,
    Sender varchar(255),
    sKey varchar(255),
    sVal varchar(255),
    ts timestamp
);

If I do this inserts:

INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',25,'2023-05-26 15:01:00'),('aIn','Prs',12,'2023-05-26 15:01:00');
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:02:00'),('aIn','Prs',13,'2023-05-26 15:02:00');
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:03:00'),('aIn','Prs',12,'2023-05-26 15:03:00');
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:04:00'),('aIn','Prs',13,'2023-05-26 15:04:00');
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:05:00'),('aIn','Prs',13,'2023-05-26 15:05:00');
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:06:00'),('aIn','Prs',13,'2023-05-26 15:06:00');
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',27,'2023-05-26 15:07:00'),('aIn','Prs',13,'2023-05-26 15:07:00');

I got result like:

ID,  Sender, sKey,  sVal, ts
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00"
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "26", "2023-05-26 15:03:00"
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"
"7",  "aIn", "Tmp", "26", "2023-05-26 15:04:00"
"8",  "aIn", "Prs", "13", "2023-05-26 15:04:00"
"9",  "aIn", "Tmp", "26", "2023-05-26 15:05:00"
"10", "aIn", "Prs", "13", "2023-05-26 15:05:00"
"11", "aIn", "Tmp", "26", "2023-05-26 15:06:00"
"12", "aIn", "Prs", "13", "2023-05-26 15:06:00"
"13", "aIn", "Tmp", "27", "2023-05-26 15:07:00"
"14", "aIn", "Prs", "13", "2023-05-26 15:07:00"

Now I want to skip duplicated values but still want to know when value is changed AND last time if value. So if I go from clean table each step I wish to get results like this:

Wanted results per each step:

Step 1:
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',25,'2023-05-26 15:01:00'),('aIn','Prs',12,'2023-05-26 15:01:00');
ID,  Sender, sKey,  sVal, ts
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"

Step 2: (all different)
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:02:00'),('aIn','Prs',13,'2023-05-26 15:02:00');
ID,  Sender, sKey,  sVal, ts
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00"
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"

Step 3: (Tmp value is same)
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:03:00'),('aIn','Prs',12,'2023-05-26 15:03:00');
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00" <--Tmp=26 @ 2023-05-26 15:02:00 start time
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "26", "2023-05-26 15:03:00" <--Tmp=26 @ 2023-05-26 15:03:00 last time
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"

Step 4: (Tmp value is same)
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:04:00'),('aIn','Prs',13,'2023-05-26 15:04:00');
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00" <--Tmp=26 @ 2023-05-26 15:02:00 start time
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "26", "2023-05-26 15:04:00" <--Tmp=26 @ 2023-05-26 15:04:00 UPDATED last time
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"
"8",  "aIn", "Prs", "13", "2023-05-26 15:04:00"

Step 5: (Tmp value is same, Prs value is same)
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:05:00'),('aIn','Prs',13,'2023-05-26 15:05:00');
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00" <--Tmp=26 @ 2023-05-26 15:02:00 start time
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "26", "2023-05-26 15:05:00" <--Tmp=26 @ 2023-05-26 15:05:00 UPDATED last time
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"
"8",  "aIn", "Prs", "13", "2023-05-26 15:04:00" <--Prs=13 @ 2023-05-26 15:04:00 start time
"10", "aIn", "Prs", "13", "2023-05-26 15:05:00" <--Prs=13 @ 2023-05-26 15:05:00 last time

Step 6: (Tmp value is same, Prs value is same)
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',26,'2023-05-26 15:06:00'),('aIn','Prs',13,'2023-05-26 15:06:00');
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00" <--Tmp=26 @ 2023-05-26 15:02:00 start time
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "26", "2023-05-26 15:06:00" <--Tmp=26 @ 2023-05-26 15:06:00 UPDATED last time
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"
"8",  "aIn", "Prs", "13", "2023-05-26 15:04:00" <--Prs=13 @ 2023-05-26 15:04:00 start time
"10", "aIn", "Prs", "13", "2023-05-26 15:06:00" <--Prs=13 @ 2023-05-26 15:06:00 UPDATED last time

Step 7: (Tmp value is different, Prs value is same)
INSERT INTO tbase (Sender,sKey,sVal,ts) VALUES ('aIn','Tmp',27,'2023-05-26 15:07:00'),('aIn','Prs',13,'2023-05-26 15:07:00');
There are two acceptable results:
This one replace last duplicated value:
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00" <--Tmp=26 @ 2023-05-26 15:02:00 start time
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "27", "2023-05-26 15:07:00" <--REPLACED Tmp value
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"
"8",  "aIn", "Prs", "13", "2023-05-26 15:04:00" <--Prs=13 @ 2023-05-26 15:04:00 start time
"10", "aIn", "Prs", "13", "2023-05-26 15:07:00" <--Prs=13 @ 2023-05-26 15:07:00 UPDATED last time

or adding new row and leave last updated in place
"1",  "aIn", "Tmp", "25", "2023-05-26 15:01:00"
"2",  "aIn", "Prs", "12", "2023-05-26 15:01:00"
"3",  "aIn", "Tmp", "26", "2023-05-26 15:02:00" <--Tmp=26 @ 2023-05-26 15:02:00 start time
"4",  "aIn", "Prs", "13", "2023-05-26 15:02:00"
"5",  "aIn", "Tmp", "26", "2023-05-26 15:06:00" <--Tmp=26 @ 2023-05-26 15:06:00 end time
"6",  "aIn", "Prs", "12", "2023-05-26 15:03:00"
"8",  "aIn", "Prs", "13", "2023-05-26 15:04:00" <--Prs=13 @ 2023-05-26 15:04:00 start time
"10", "aIn", "Prs", "13", "2023-05-26 15:07:00" <--Prs=13 @ 2023-05-26 15:07:00 UPDATED last time
"13", "aIn", "Tmp", "27", "2023-05-26 15:07:00" <--New TMP value as different
eSlavko
  • 348
  • 3
  • 12

1 Answers1

0

You can use insert ... on duplicate key for this.

This requires that your have a unique (or PK) constraint on tuple of columns (Sender,sKey, sVal). Then you can phrase the query as:

insert into base (sender, skey, sval) 
values ('ain', 'tmp', 25)
on duplicate key update timestamp = current_timestamp
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Doesn't work. Actual line I got from my IOT is: INSERT INTO IoTData (Sender,sKey,sVal,timestamp) VALUES ('KupolaIn','Temp',26.4,'2023-05-26 10:16:12'),('KupolaIn','Pressure',97333,'2023-05-26 10:16:12') and I added on duplicate... but still got duplicates on db – eSlavko May 26 '23 at 10:17
  • ..so there is timestamp too.. – eSlavko May 26 '23 at 10:19
  • ..and as I understand "on duplicate" will remove information when value is changed 1st time. – eSlavko May 26 '23 at 10:21
  • 1
    If you still get a duplicate, you missed to to what GMB wrote in his 2nd sentence: add a unique key. This is the requirement for on duplicate key to work. (And without on duplicate key, you would get an error, but you would never get a duplicate). – Solarflare May 26 '23 at 10:31
  • My question is very similar with proposed duplicate. The difference is that when data is same I want to have record when data come to this value, and as last row I want to have record "when" data is still at same value. UPDATE statement just skip adding last value or update value in database and lost track when 1st time appear. – eSlavko May 26 '23 at 12:49