64

I am cleaning out a database table without a primary key (I know, I know, what were they thinking?). I can't delete a row via a GUI (in this case MySQL Workbench, but I'm looking for a database-agnostic approach) because it refuses to perform tasks on tables without primary keys (or at least a UQ NN column), and I cannot add a primary key, because there is a duplicate in the column that would become the key. The duplicate value comes from one of two rows that are in all respects identical.

How can I delete one of the twins?

lofidevops
  • how many such duplicates are there? – Alnitak May 08 '13 at 12:12
  • @Alnitak in the original context of this question, just one or two - I've just discovered another table with *loads* of duplicates where a strategy like http://stackoverflow.com/a/3777663/236081 will probably be more appropriate – lofidevops May 08 '13 at 13:01

19 Answers

72
SET ROWCOUNT 1
DELETE FROM [table] WHERE ....
SET ROWCOUNT 0

This will delete only one of the two identical rows. (Note that SET ROWCOUNT is SQL Server-specific.)

Rinaldo
    By far the simplest of solutions.. no need for duplicate tables or altering the current table like all other suggestions. – woony Jun 25 '15 at 12:15
  • Even simpler would be `DELETE TOP 1 FROM ...`. This is also better in future, if using SQL Server: [Remarks - Important - Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server](https://technet.microsoft.com/en-us/library/ms188774.aspx) – ToolmakerSteve Jul 11 '15 at 23:37
  • An explanation would be good. – Jorge Nachtigall Sep 16 '21 at 18:26
  • The question currently doesn't specify a database system, and I think this is not standard SQL... And even for SQL Server it looks like it's out of date. – Andrew Spencer Nov 19 '21 at 09:05
  • Syntax update: SQL Server might require parenthesis now: DELETE TOP (1) FROM ... – sobrio35 Aug 01 '23 at 16:42
38

One option to solve your problem is to create a new table with the same schema, and then do:

INSERT INTO new_table (SELECT DISTINCT * FROM old_table)

and then just rename the tables.

You will, of course, need roughly as much free disk space as the table currently occupies to do this!

It's not efficient, but it's incredibly simple.
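The copy-distinct-rename flow can be sketched end to end with sqlite3 as a stand-in engine (table and column names are illustrative; note that `CREATE TABLE ... AS` does not copy indexes or constraints):

```python
# Sketch of the copy-distinct-rename approach using sqlite3;
# table and column names are invented for illustration.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE old_table (logid TEXT, level TEXT)")
con.executemany("INSERT INTO old_table VALUES (?, ?)",
                [("ID1", "INFO"), ("ID1", "INFO"), ("ID2", "WARN")])

# Copy only distinct rows into a fresh table with the same shape,
# then swap the tables by renaming.
con.execute("CREATE TABLE new_table AS SELECT DISTINCT * FROM old_table")
con.execute("DROP TABLE old_table")
con.execute("ALTER TABLE new_table RENAME TO old_table")

rows = con.execute("SELECT * FROM old_table ORDER BY logid").fetchall()
print(rows)  # [('ID1', 'INFO'), ('ID2', 'WARN')] -- the ID1 twin is gone
```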

Alnitak
  • I still think my colleague's solution (in my answer) is neat, but yours is indeed super simple and you explain the caveats well – lofidevops May 08 '13 at 14:05
  • hmmm, this actually didn't work :( for some reason new_table contained everything from old_table - any ideas? – lofidevops May 10 '13 at 12:42
  • I just tried it on a trivial table and it worked exactly as expected – Alnitak May 10 '13 at 13:15
  • must have been me - I did http://stackoverflow.com/a/3312066/236081 in the meanwhile, which also worked (this is a non-production database so I didn't mind the warnings) – lofidevops May 10 '13 at 13:32
  • Far from an actual solution if you can't create a new table. –  Oct 31 '19 at 13:52
  • For SQL Server it would be:- SELECT DISTINCT * INTO new_table FROM old_table – Rahul Satal May 10 '20 at 12:54
27

`DELETE TOP (1)` works on Microsoft SQL Server (T-SQL).

RubberDuck
mattinsalto
25

Note that MySQL has its own extension of DELETE, DELETE ... LIMIT, which works the way you'd expect from LIMIT: http://dev.mysql.com/doc/refman/5.0/en/delete.html

The MySQL-specific LIMIT row_count option to DELETE tells the server the maximum number of rows to be deleted before control is returned to the client. This can be used to ensure that a given DELETE statement does not take too much time. You can simply repeat the DELETE statement until the number of affected rows is less than the LIMIT value.

Therefore, you could use DELETE FROM some_table WHERE x="y" AND foo="bar" LIMIT 1; note that there isn't a simple way to say "delete everything except one" - just keep checking whether you still have row duplicates.
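In miniature, with sqlite3 as the sandbox (stock SQLite builds usually lack DELETE ... LIMIT, so this sketch targets a single rowid to the same effect; on MySQL the one-line DELETE above suffices — the data is invented for illustration):

```python
# Emulating MySQL's DELETE ... LIMIT 1 on sqlite3 by deleting a single rowid.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE some_table (x TEXT, foo TEXT)")
con.executemany("INSERT INTO some_table VALUES (?, ?)",
                [("y", "bar"), ("y", "bar"), ("z", "baz")])

# MySQL equivalent: DELETE FROM some_table WHERE x='y' AND foo='bar' LIMIT 1;
con.execute("""
    DELETE FROM some_table
    WHERE rowid = (SELECT rowid FROM some_table
                   WHERE x = 'y' AND foo = 'bar' LIMIT 1)
""")

remaining = con.execute(
    "SELECT COUNT(*) FROM some_table WHERE x='y' AND foo='bar'").fetchone()[0]
print(remaining)  # 1 -- exactly one of the twins was removed
```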

Piskvor left the building
12

This can be accomplished using a CTE and the ROW_NUMBER() function, as below:

/* Sample Data */
    CREATE TABLE #dupes (ID INT, DWCreated DATETIME2(3))

    INSERT INTO #dupes (ID, DWCreated) SELECT 1, '2015-08-03 01:02:03.456'
    INSERT INTO #dupes (ID, DWCreated) SELECT 2, '2014-08-03 01:02:03.456'
    INSERT INTO #dupes (ID, DWCreated) SELECT 1, '2013-08-03 01:02:03.456'

/* Check sample data - returns three rows, with two rows for ID#1 */
    SELECT * FROM #dupes 

/* CTE to give each row that shares an ID a unique number */
    ;WITH toDelete AS
      (
        SELECT ID, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DWCreated) AS RN
        FROM #dupes 
      )

  /* Delete any row that is not the first instance of an ID */
    DELETE FROM toDelete WHERE RN > 1

/* Check the results: ID is now unique */
    SELECT * FROM #dupes

/* Clean up */
    DROP TABLE #dupes

Having a column to ORDER BY is handy, but not necessary unless you have a preference for which of the rows to delete. This will also handle all instances of duplicate records, rather than forcing you to delete one row at a time.
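When there is no key at all, the same idea can be run by partitioning on every column. A rough sqlite3 sketch (table and column names are illustrative; SQLite's implicit rowid stands in as the row address, and window functions need SQLite >= 3.25):

```python
# ROW_NUMBER() over all columns, deleting every row after the first per group.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dupes (id INTEGER, dwcreated TEXT)")
con.executemany("INSERT INTO dupes VALUES (?, ?)",
                [(1, "2015-08-03"), (2, "2014-08-03"), (1, "2015-08-03")])

# Partition by every column; keep only the first row of each group,
# addressed via SQLite's implicit rowid.
con.execute("""
    DELETE FROM dupes
    WHERE rowid IN (
        SELECT rowid FROM (
            SELECT rowid,
                   ROW_NUMBER() OVER (PARTITION BY id, dwcreated
                                      ORDER BY rowid) AS rn
            FROM dupes)
        WHERE rn > 1)
""")

remaining = con.execute("SELECT COUNT(*) FROM dupes").fetchone()[0]
print(remaining)  # 2 -- one of the identical (1, '2015-08-03') rows is gone
```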

AHiggins
  • Best solution here IMHO. :) – Neil Barnwell Feb 05 '18 at 11:31
  • Yes. It's standard SQL isn't it? Should work on all modern databases. But this example needs adapting to fit the original question, which asked what to do when there is no PK. Instead of `PARTITION BY ID` you need to `PARTITION BY COL1,COL2,COL3` (giving the list of all columns, or all that should be unique) – Andrew Spencer Nov 19 '21 at 09:19
10

This works for PostgreSQL

DELETE FROM tablename WHERE id = 123 AND ctid IN (SELECT ctid FROM tablename WHERE id = 123 LIMIT 1)
Vlad B
9

For PostgreSQL you can do this:

DELETE FROM tablename
WHERE id IN (SELECT id
          FROM (SELECT id, ROW_NUMBER() 
               OVER (partition BY column1, column2, column3 ORDER BY id) AS rnum
               FROM tablename) t
          WHERE t.rnum > 1);

column1, column2, column3 would be the set of columns that have duplicate values.

Reference here.

Turbut Alin
  • Whilst this may theoretically answer the question, [it would be preferable](//meta.stackoverflow.com/q/8259) to include the essential parts of the answer here, and provide the link for reference. – Bhargav Rao Nov 07 '16 at 13:53
  • Doesn't this still require `id` to be unique between the rows? – Narfanator May 18 '17 at 19:29
  • I believe this answer is also standard-compliant, so it should probably work with almost every modern RDBMS. – hemflit Jun 11 '18 at 10:11
  • Standards-compliant is good, but it still doesn't answer the original question, which was to resolve this issue when there is no unique ID – Andrew Spencer Nov 19 '21 at 09:04
7

Have you tried LIMIT 1? This will delete only one of the rows that match your DELETE query:

DELETE FROM `table_name` WHERE `column_name`='value' LIMIT 1;
Feelsbadman
4

In my case I could get the GUI to give me a string of values of the row in question (alternatively, I could have done this by hand). On the suggestion of a colleague, in whose debt I remain, I used this to create an INSERT statement:

INSERT INTO some_table
VALUES ('ID1219243408800307444663', '2004-01-20 10:20:55', 'INFORMATION', 'admin' (...));

I tested the insert statement, so that I now had triplets. Finally, I ran a simple DELETE to remove all of them...

DELETE FROM some_table WHERE logid = 'ID1219243408800307444663';

followed by the INSERT one more time, leaving me with a single row, and the bright possibilities of a primary key.
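The whole round trip can be sketched with sqlite3 (the columns besides logid are invented for illustration):

```python
# The delete-then-reinsert trick: keep one copy aside, delete every twin,
# then put the single copy back.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE some_table (logid TEXT, created TEXT, level TEXT)")
row = ("ID1219243408800307444663", "2004-01-20 10:20:55", "INFORMATION")
con.executemany("INSERT INTO some_table VALUES (?, ?, ?)", [row, row])

# Delete all copies of the twin, then re-insert exactly one.
con.execute("DELETE FROM some_table WHERE logid = ?", (row[0],))
con.execute("INSERT INTO some_table VALUES (?, ?, ?)", row)

count = con.execute("SELECT COUNT(*) FROM some_table WHERE logid = ?",
                    (row[0],)).fetchone()[0]
print(count)  # 1 -- a single row remains
```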

lofidevops
4

In case you can add a column, like

  ALTER TABLE yourtable ADD IDCOLUMN bigint NOT NULL IDENTITY (1, 1)

do so.

Then count rows, grouping by your problem column, where count > 1; this will identify your twins (or triplets or whatever).

Then select your problem column where its content equals the identified content above, and check the IDs in IDCOLUMN.

Finally, delete from your table where IDCOLUMN equals one of those IDs.
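A rough sqlite3 sketch of the same procedure; SQLite's implicit rowid plays the part of the added IDCOLUMN, and the table/column names are illustrative:

```python
# Identify duplicated groups, then delete all but the lowest rowid per group.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE yourtable (col TEXT)")
con.executemany("INSERT INTO yourtable VALUES (?)", [("a",), ("a",), ("b",)])

# Step 1: identify the twins.
twins = con.execute("""
    SELECT col, COUNT(*) FROM yourtable GROUP BY col HAVING COUNT(*) > 1
""").fetchall()

# Step 2: keep the lowest rowid in each group, delete the rest.
con.execute("""
    DELETE FROM yourtable
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM yourtable GROUP BY col)
""")

kept = con.execute("SELECT col FROM yourtable ORDER BY col").fetchall()
print(twins, kept)  # [('a', 2)] [('a',), ('b',)]
```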

Der U
4

You could use a max, which was relevant in my case.

DELETE FROM [table] where id in 
(select max(id) from [table] group by id, col2, col3 having count(id) > 1)

Be sure to test your results first, and have a limiting condition in your HAVING clause. With such a huge delete query you might want to back up your database first.

TIm
1
DELETE TOP (1) FROM tableName
WHERE -- your conditions for filtering identical rows
Behnam
0

I added a Guid column to the table and set it to generate a new id for each row. Then I could delete the rows using a GUI.

Ian Warburton
0

In PostgreSQL there is an implicit column called ctid. See the wiki. So you are free to use the following:

WITH cte1 as(
    SELECT unique_column, max( ctid ) as max_ctid
    FROM table_1
    GROUP BY unique_column
    HAVING count(*) > 1
), cte2 as(
    SELECT t.ctid as target_ctid
    FROM table_1 t
    JOIN cte1 USING( unique_column )
    WHERE t.ctid != max_ctid
)
DELETE FROM table_1
WHERE ctid IN( SELECT target_ctid FROM cte2 )

I'm not sure how safe it is to use this when there is a possibility of concurrent updates. So one may find it sensible to run LOCK TABLE table_1 IN ACCESS EXCLUSIVE MODE; before actually doing the cleanup.

volvpavl
0

In case there are multiple duplicate rows to delete, all fields are identical (no distinguishing id), and the table has no primary key, one option is to save one copy of each duplicate row (with DISTINCT) in a new table, delete all the duplicate rows, and insert the saved rows back. This is helpful if the table is really big and the number of duplicate rows is small.

-- col1, col2 ... coln are the table columns that are relevant.
-- If not sure, add all columns of the table in the select below and in the where clauses later.

-- Make a copy of the table T so you can roll back any time, if possible.
-- Check @@ROWCOUNT to be sure it's what you want.
-- Use transactions and roll back in case there is an error.

-- First find all duplicate rows that are identical; this statement could be merged
-- with the next one if you choose all columns.

select col1, col2, -- other columns as needed
  count(*) c into temp_duplicate from T group by col1, col2 having count(*) > 1

-- Save each set of identical rows only once (DISTINCT).

select distinct T.* into temp_insert from T, temp_duplicate D where
T.col1 = D.col1 and
T.col2 = D.col2 -- and other columns if needed

-- Delete all the rows that are duplicated.

delete T from T, temp_duplicate D where
T.col1 = D.col1 and
T.col2 = D.col2 -- and other columns if needed

-- Add the duplicate rows back, now only once.
insert into T select * from temp_insert

-- Drop the temp tables after you check all is ok.
detzu
0

If, like me, you don't want to have to list out all the columns of the table, you can convert each row to JSONB and compare by that.

(NOTE: This is incredibly inefficient - be careful!)

select to_jsonb(a.*), to_jsonb(b.*)
FROM
    some_table a
        left join some_table b
on
    a.entry_date < b.entry_date
where (SELECT NOT exists(
    SELECT
    FROM jsonb_each_text(to_jsonb(a.*) - 'unwanted_column') t1
         FULL OUTER JOIN jsonb_each_text(to_jsonb(b.*) - 'unwanted_column') t2 USING (key)
    WHERE t1.value<>t2.value OR t1.key IS NULL OR t2.key IS NULL
))
danielmhanover
0

Suppose we want to delete duplicate records, keeping only one unique record each, from an Employee table - Employee(id, name, age):

delete from Employee
where id not in (select MAX(id)
                  from Employee
                  group by name, age
                );
gurkan
0

You can use LIMIT 1. This works perfectly for me with MySQL:

delete from `your_table` [where condition] limit 1;
BenSmile
0

DELETE FROM Table_Name

WHERE ID NOT IN
(
    SELECT MAX(ID) AS MaxRecordID
    FROM Table_Name
    GROUP BY [FirstName], 
             [LastName], 
             [Country]
);
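The pattern can be checked quickly with sqlite3 (sample data invented for illustration; the duplicates share the name columns but carry distinct IDs):

```python
# Keep only the MAX(ID) per natural-key group, delete the rest.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Table_Name (ID INTEGER, FirstName TEXT, Country TEXT)")
con.executemany("INSERT INTO Table_Name VALUES (?, ?, ?)",
                [(1, "Ann", "DE"), (2, "Ann", "DE"), (3, "Bob", "FR")])

con.execute("""
    DELETE FROM Table_Name
    WHERE ID NOT IN (SELECT MAX(ID) FROM Table_Name
                     GROUP BY FirstName, Country)
""")

remaining_ids = con.execute("SELECT ID FROM Table_Name ORDER BY ID").fetchall()
print(remaining_ids)  # [(2,), (3,)] -- row 1, the older 'Ann' twin, is gone
```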
Fahad Ashiq