0

I'm having a little trouble trying to delete multiple entries from my database which are very similar but not exactly the same. I have looked here and many other solutions:

Delete Duplicate Records in PostgreSQL

How to delete duplicate rows with SQL?

Everytime I try to delete multiple duplicates from the table, The command removes all the entries instead of just removing the duplicates.

Here is the table with a duplicate sample, where we should only keep one unique hdrtime:

SELECT * from stdtextproducts where xxxid='DEN' AND nnnid='MTR' and hdrtime='270600';
 cccid |  datacrc   | hdrtime | nnnid | site | wmoid  | xxxid | bbbid |       inserttime        |                                 product                                  |    reftime    
-------+------------+---------+-------+------+--------+-------+-------+-------------------------+--------------------------------------------------------------------------+---------------
 DEN   | 3680361181 | 270600  | MTR   | KDEN | SAUS70 | DEN   | RRF   | 2018-08-27 05:55:51.811 | SAUS70 KDEN 270600 RRF                                                  +| 1535349351811
       |            |         |       |      |        |       |       |                         | METAR KDEN 270553Z 22017KT 10SM BKN150 OVC200 23/06 A2991 RMK AO2 PK WND+| 
       |            |         |       |      |        |       |       |                         |      22026/0456 SLP028 T02330056 10289 20222 58004                       | 
 DEN   | 1538417601 | 270600  | MTR   | KDEN | SAUS70 | DEN   | RRM   | 2018-08-27 05:57:57.356 | SAUS70 KDEN 270600 RRM                                                  +| 1535349477356
       |            |         |       |      |        |       |       |                         | METAR KDEN 270553Z 22017KT 10SM BKN150 OVC200 23/06 A2991 RMK AO2 PK WND+| 
       |            |         |       |      |        |       |       |                         |      22026/0456 SLP028 T02330056 10289 20222 58004                       | 
(2 rows)

I have tried the following:

DELETE FROM stdtextproducts a USING (SELECT MIN(ctid) as ctid, hdrtime FROM stdtextproducts GROUP BY hdrtime HAVING COUNT(*) > 1) b WHERE a.hdrtime = b.hdrtime AND a.ctid <> b.ctid;

And the following:

DELETE FROM stdtextproducts WHERE reftime NOT IN (SELECT MAX(reftime) FROM stdtextproducts GROUP BY hdrtime);

I should be expecting only one entry to show up on the list but it seems none of the entries are there anymore.

SELECT * from stdtextproducts where xxxid='DEN' AND nnnid='MTR' and hdrtime='270600';
 cccid | datacrc | hdrtime | nnnid | site | wmoid | xxxid | bbbid | inserttime | product | reftime 
-------+---------+---------+-------+------+-------+-------+-------+------------+---------+---------
(0 rows)

What am I missing here?

Thanks in advance.

Ray Y
  • 1,261
  • 3
  • 16
  • 24
  • DELETE FROM stdtextproducts WHERE reftime NOT IN (SELECT MAX(reftime) FROM stdtextproducts b GROUP BY stdtxtproducts.hdrtime=b.hdrtime) just a guess – Joe Love Aug 27 '18 at 19:19

2 Answers2

1

try like below

  DELETE FROM stdtextproducts a where
   a.ctid<>   
    (SELECT MIN(b.ctid) as ctid
    FROM stdtextproducts b
    where a.hdrtime=b.hdrtime  
    ) 

or

   DELETE FROM stdtextproducts T1 
   USING   stdtextproducts  T2
    WHERE   T1.ctid < T2.ctid  -- delete the older versions
    AND T1.hdrtime= T2.hdrtime ;  -- add more columns if needed
Zaynul Abadin Tuhin
  • 31,407
  • 5
  • 33
  • 63
  • I tried the 1st option, but it is taking too long. The 2nd option deletes everything in the table. Oh and T1.DEN should be T1.xxxid column. – Ray Y Aug 27 '18 at 17:44
  • @RayY for 2nd option you need key join and yes 1st one is too slow – Zaynul Abadin Tuhin Aug 27 '18 at 17:47
  • 1
    Thanks it is working now, for others looking for the solution from google, I had tweaked your 2nd method a bit and came up with the following that works: DELETE FROM stdtextproducts a USING stdtextproducts b WHERE a.ctid < b.ctid AND a.site = b.site AND a.hdrtime = b.hdrtime AND a.nnnid = b.nnnid; – Ray Y Aug 27 '18 at 20:26
0
DELETE FROM stdtextproducts a where
   exists 
    (SELECT *
    FROM stdtextproducts b
    where a.hdrtime=b.hdrtime 
    and b.ctid<a.ctid
    ) 

If this does not work, are you sure there isn't another KEY here besides just hrdtime?

Joe Love
  • 5,594
  • 2
  • 20
  • 32