0

I've tried to delete unnecessary data from a table, but it hasn't worked for a different date.
I got the query from here
This is my query:

WITH date AS
(SELECT ID, 
MIN(time) AS minDate, 
MAX(time) AS maxDate
FROM tablename
GROUP BY ID)
DELETE FROM tablename
  WHERE 
    NOT EXISTS 
      (SELECT 1 FROM tablename
        WHERE tablename.ID = d.ID 
          AND tablename.time IN (d.minDate, d.maxDate));

This is the table:

ID name time date dir
4 Ety 07:51:48 2023-07-20 in
4 Ety 17:04:07 2023-07-20 out
4 Ety 07:50:48 2023-07-20 in
4 Ety 17:08:07 2023-07-21 out
4 Ety 07:08:07 2023-07-21 in
6 Herry 07:26:03 2023-07-20 in
6 Herry 07:16:11 2023-07-20 in
6 Herry 17:26:11 2023-07-20 out
6 Herry 06:26:11 2023-07-21 in
6 Herry 07:12:11 2023-07-21 in
6 Herry 17:26:11 2023-07-21 out
7 Martha 07:50:23 2023-07-20 in
7 Martha 17:04:43 2023-07-20 out
7 Martha 07:50:24 2023-07-21 in
7 Martha 17:04:44 2023-07-21 out
8 Martha 17:14:45 2023-07-21 out

Expected table:

ID name time date dir
4 Ety 07:51:48 2023-07-20 in
4 Ety 17:04:07 2023-07-20 out
4 Ety 07:08:07 2023-07-21 in
4 Ety 17:08:07 2023-07-21 out
6 Herry 07:16:11 2023-07-20 in
6 Herry 17:26:11 2023-07-20 out
6 Herry 06:26:11 2023-07-21 in
6 Herry 17:26:11 2023-07-21 out
7 Martha 07:50:24 2023-07-20 in
7 Martha 17:04:44 2023-07-20 out
7 Martha 07:50:24 2023-07-21 in
7 Martha 17:14:45 2023-07-21 out
Dale K
  • 25,246
  • 15
  • 42
  • 71
mycode binary
  • 27
  • 1
  • 6
  • 1
    Those MD formatted tables really waste screen height... – jarlh Jul 25 '23 at 13:46
  • Using `In (minDate, maxDate)` - that is only going to delete two dates. You probably (?) intend to do a range (such as `Between minDate and maxDate`) – topsail Jul 25 '23 at 13:46
  • 1
    ya ive edited my query, sorry – mycode binary Jul 25 '23 at 13:47
  • 1
    That your dates have different formats is a *massive* issue; this suggests your column `date` is actually *not* a `date` but a `(n)varchar`; that's a severe design flaw. – Thom A Jul 25 '23 at 13:52
  • ok,sorry ive been edited yet... – mycode binary Jul 25 '23 at 13:55
  • 1
    It is bad practice to save date and time in separate columns. Why are you doing that? What is the logic here that tells you which rows should be deleted? You showed sample data and the expected result after the deletion, but you didn't explain it – Jonas Metzler Jul 25 '23 at 14:20

1 Answers1

0

First, let's take a look at your query:

WITH date AS
(SELECT ID, 
MIN(time) AS minDate, 
MAX(time) AS maxDate
FROM tablename
GROUP BY ID)
DELETE FROM tablename
  WHERE 
    NOT EXISTS 
      (SELECT 1 FROM tablename
        WHERE tablename.ID = d.ID 
          AND tablename.time IN (d.minDate, d.maxDate));

Let's state what this means with plain words:

"Let's get the ID, minDate and maxDate for each ID group from tablename and then, using this result, let's remove all records whose ID matches one of the group, but whose time differs both minDate and maxDate"

The flaw in the logic is that your intended query would translate to this:

"Let's get the ID, minDate and maxDate for each ID group from tablename and then, using this result, let's remove all records whose ID matches one of the group, but whose time is outside of the range of minDate and maxDate"

So, let's change the operand of the IN known for set theory to ranges:

WITH date AS
(SELECT ID, 
MIN(time) AS minDate, 
MAX(time) AS maxDate
FROM tablename
GROUP BY ID)
DELETE FROM tablename
  WHERE 
    NOT EXISTS 
      (SELECT 1 FROM tablename
        WHERE tablename.ID = d.ID 
          AND NOT (tablename.time BETWEEN d.minDate AND d.maxDate));

or, if you really like mathematical symbols, then

WITH date AS
(SELECT ID, 
MIN(time) AS minDate, 
MAX(time) AS maxDate
FROM tablename
GROUP BY ID)
DELETE FROM tablename
  WHERE 
    NOT EXISTS 
      (SELECT 1 FROM tablename
        WHERE tablename.ID = d.ID 
          AND NOT (tablename.time >= d.minDate AND tablename.time d.maxDate));

But it seems that the query fix by itself will not solve your problem. You have data in different formats in your example, which means that you store the date as text. You will need to fix your data as well, make sure you have a date/time field rather than a text field or, if you really really need (?!) or prefer them to be stored as a string, then make sure that they are converted for the purpose of the comparison into a format which alphabetically would yield the same order as logically for the dates.

EDIT

Your date/time information is also separated into two separate fields. My query assumes this information to be part of a single column. If you have two columns, then you can concatenate them, but really, you should unite them into a single field (that's actually a criteria for 1NF as well) and convert it to date/time.

Lajos Arpad
  • 64,414
  • 37
  • 100
  • 175
  • @JonasMetzler the other question was asked by someone else, apparently and it has been used as a model in this question. But the asker here did not quite understand it and needed to ask a question about it. At least he has done some research and tried something, a failed attempt, but I rewarded the effort by answering the question. – Lajos Arpad Jul 25 '23 at 14:18
  • 1
    Oh, sorry, my bad. I misread the first part of their question. And of course I agree it's good they did some research. Would be great if they store the entire datetime in one column. – Jonas Metzler Jul 25 '23 at 14:23
  • 2
    @JonasMetzler I missed the fact that the asker has split the date/time into two separate fields and written my original answer with the wrong impression (assumption) that it's a single field. Thanks for pointing out that this is actually not the case, I have added an edit to the answer to reflect on that. – Lajos Arpad Jul 25 '23 at 15:13