96

What's the best way to delete all rows from a table in sql but to keep n number of rows on the top?

Riri
  • 11,501
  • 14
  • 63
  • 88

11 Answers11

91
DELETE FROM Table WHERE ID NOT IN (SELECT TOP 10 ID FROM Table)

Edit:

Chris brings up a good performance hit since the TOP 10 query would be run for each row. If this is a one time thing, then it may not be as big of a deal, but if it is a common thing, then I did look closer at it.

Imran Ali Khan
  • 8,469
  • 16
  • 52
  • 77
Cory Foy
  • 7,202
  • 4
  • 31
  • 34
  • 6
    Just a note that you can solve the subquery performance issue via either creating a temporary table manually (assuming this is an infrequent operation) or writing the query as `DELETE FROM Table WHERE ID NOT IN (SELECT id FROM (SELECT TOP 10 ID FROM Table) AS x)` to force MySQL to create a temporary table. – Michael Mior Jul 19 '11 at 16:46
  • 1
    The subquery run multiple times, is it true ? http://stackoverflow.com/questions/18790796/does-the-mysql-in-clause-execute-the-subquery-multiple-times – djluis Oct 27 '16 at 09:27
  • 5
    @ Daniel Schaffer Doesn't sound like they got any db or business logic issue. Sounds like a totally normal retention policy. – Hejazzman May 17 '17 at 16:52
  • What if I don't have primary key in my table? – Riosant Apr 15 '21 at 12:06
  • not work in postgres (9.5.x) – Siwei Feb 26 '22 at 01:02
33

I would select ID column(s) the set of rows that you want to keep into a temp table or table variable. Then delete all the rows that do not exist in the temp table. The syntax mentioned by another user:

DELETE FROM Table WHERE ID NOT IN (SELECT TOP 10 ID FROM Table)

Has a potential problem. The "SELECT TOP 10" query will be executed for each row in the table, which could be a huge performance hit. You want to avoid making the same query over and over again.

This syntax should work, based what you listed as your original SQL statement:

create table #nuke(NukeID int)

insert into #nuke(Nuke) select top 1000 id from article

delete article where not exists (select 1 from nuke where Nukeid = id)

drop table #nuke
Sam
  • 28,421
  • 49
  • 167
  • 247
Chris Miller
  • 4,809
  • 4
  • 33
  • 50
  • 4
    `insert into #nuke(Nuke) ...` should probably be: `insert into #nuke(NukeID) ...` Also the name nuke is confusing because you are trying to NOT delete these rows. nuke is probably named after the fact that it will be deleted. – Emond Dec 06 '16 at 12:02
14

Future reference for those of use who don't use MS SQL.

In PostgreSQL use ORDER BY and LIMIT instead of TOP.

DELETE FROM table
WHERE id NOT IN (SELECT id FROM table ORDER BY id LIMIT n);

MySQL -- well...

Error -- This version of MySQL does not yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

Not yet I guess.

Joël Salamin
  • 3,538
  • 3
  • 22
  • 33
Simurr
  • 672
  • 1
  • 4
  • 19
10

Here is how I did it. This method is faster and simpler:

Delete all but top n from database table in MS SQL using OFFSET command

WITH CTE AS
    (
    SELECT  ID
    FROM    dbo.TableName
    ORDER BY ID DESC
    OFFSET 11 ROWS
    )
DELETE CTE;

Replace ID with column by which you want to sort. Replace number after OFFSET with number of rows which you want to keep. Choose DESC or ASC - whatever suits your case.

NapkinBob
  • 632
  • 7
  • 19
Hrvoje
  • 13,566
  • 7
  • 90
  • 104
6

I think using a virtual table would be much better than an IN-clause or temp table.

DELETE 
    Product
FROM
    Product
    LEFT OUTER JOIN
    (
        SELECT TOP 10
            Product.id
        FROM
            Product
    ) TopProducts ON Product.id = TopProducts.id
WHERE
    TopProducts.id IS NULL
Tim Wilson
  • 365
  • 3
  • 10
4

This really is going to be language specific, but I would likely use something like the following for SQL server.

declare @n int
SET @n = SELECT Count(*) FROM dTABLE;
DELETE TOP (@n - 10 ) FROM dTable

if you don't care about the exact number of rows, there is always

DELETE TOP 90 PERCENT FROM dTABLE;
Noah
  • 15,080
  • 13
  • 104
  • 148
2

I don't know about other flavors but MySQL DELETE allows LIMIT.

If you could order things so that the n rows you want to keep are at the bottom, then you could do a DELETE FROM table LIMIT tablecount-n.

Edit

Oooo. I think I like Cory Foy's answer better, assuming it works in your case. My way feels a little clunky by comparison.

Community
  • 1
  • 1
Mark Biek
  • 146,731
  • 54
  • 156
  • 201
1

I've got a trick to avoid executing the TOP expression for every row. We can combine TOP with MAX to get the MaxId we want to keep. Then we just delete everything greater than MaxId.

-- Declare Variable to hold the highest id we want to keep. 
DECLARE @MaxId as int = (
SELECT MAX(temp.ID)
FROM (SELECT TOP 10 ID FROM table ORDER BY ID ASC) temp
)

-- Delete anything greater than MaxId. If MaxId is null, there is nothing to delete.
IF @MaxId IS NOT NULL
    DELETE FROM table WHERE ID > @MaxId

Note: It is important to use ORDER BY when declaring MaxId to ensure proper results are queried.

clamchoda
  • 4,411
  • 2
  • 36
  • 74
0

I would solve it using the technique below. The example expect an article table with an id on each row.

Delete article where id not in (select top 1000 id from article)

Edit: Too slow to answer my own question ...

Riri
  • 11,501
  • 14
  • 63
  • 88
0

Refactored?

Delete a From Table a Inner Join (
    Select Top (Select Count(tableID) From Table) - 10) 
        From Table Order By tableID Desc
) b On b.tableID = A.tableID

edit: tried them both in the query analyzer, current answer is fasted (damn order by...)

Shawn
  • 19,465
  • 20
  • 98
  • 152
0

Better way would be to insert the rows you DO want into another table, drop the original table and then rename the new table so it has the same name as the old table

SQLMenace
  • 132,095
  • 25
  • 206
  • 225