-1

I have 900 Millons rows in a table. How to delete row in each time, where on condition more than 1000 days.

I am using below query:

delete
FROM db.dbo.table
WHERE [DateIn] <= DATEADD(DAY, -1000, GETDATE());

It should delete more than 1000 days but each time deletes doesn't do what I expect.

Thom A
  • 88,727
  • 11
  • 45
  • 75
jack
  • 13
  • 4
  • What do you mean by it deletes 1 lack records? Do you mean it lacks to delete 1 row, so 999 rows, instead of 1,000? – Thom A Jan 06 '23 at 14:29
  • 1
    That logic doesn't delete 1,000 rows, it deletes ALL rows where the DateIn is more than 1,000 days ago. As of today that would be April 11, 2020. I have a feeling that is not what you are trying to do. – Sean Lange Jan 06 '23 at 14:32
  • I've reworded your question, hopefully it makes more sense. Can you explain why the delete isn't working as you want it to? The above would delete any rows where the value of `DateIn` is less than 1,000 days ago; so for today that would be where the `DateIn` is on or prior to 2020-04-11 {time that query was run}. – Thom A Jan 06 '23 at 14:33
  • my intention is to delete 100000 records not impact on my server memory each time from millions records – jack Jan 06 '23 at 14:34
  • A delete that effect *many rows* can/will use resources. Especially if there are things like indexes to update, foreign keys to cascade, triggers to trigger, etc. – Thom A Jan 06 '23 at 14:35
  • Perhaps you would be better off batching your `DELETE`, if you find that deleting 100,000 in one go is too demanding on the instance; though SQL Server can normally easily handle such voumes. – Thom A Jan 06 '23 at 14:37
  • @Larnu the deleted work but i want to keep 1000 days records after that i want to use condition to delete 100000 records each time – jack Jan 06 '23 at 14:38
  • And the above *will* keep any rows where `DateIn` has a value *after* 2020-04-11 {time that query was run}, so what is your point? What's the problem exactly? – Thom A Jan 06 '23 at 14:39
  • 2
    As a side note, you really need to get your instance upgraded. SQL Server 2008 has been *completely* unsupported since mid 2019; it has not had any security updates is over 3.5 years. it has known security flaws and continuing to use it so long after the support ended can be considered quite irresponsible. – Thom A Jan 06 '23 at 14:40
  • 2
    I suspect "lack" was originally "lakh" which is a word for 100,000. I think OP wants to delete anything older than 1,000 days but chunk it up into multiple transactions of 100,000 rows at once – RickyTillson Jan 06 '23 at 14:48
  • @RickyTillson Thanks; good to know. From now on I use that instead of a number :) Yay – Mark Schultheiss Jan 06 '23 at 14:52
  • @RickyTillson correct – jack Jan 06 '23 at 14:53
  • What you want then, @jack, is batching. Does this answer your question? [How to delete large data of table in SQL without log?](https://stackoverflow.com/questions/24213299/how-to-delete-large-data-of-table-in-sql-without-log) – Thom A Jan 06 '23 at 14:55
  • I want to keep 1000 days records then after i want to delete 100000 records each time – jack Jan 06 '23 at 15:07

1 Answers1

0

Consider backing up (always back up) first to another table or otherwise as delete recovery is quite expensive. Disclaimer: I did NOT test these and this is just a super quick post here. Always test prior to any execution against production data.

This should do the trick but you may consider a smaller chunk size. This may or may not perform well in your circumstance;

WHILE EXISTS (SELECT * FROM db.dbo.table WHERE [DateIn] <= DATEADD(DAY, -1000, CURRENT_TIMESTAMP))
BEGIN
BEGIN TRANSACTION
DELETE TOP (100000)
FROM db.dbo.table
WHERE [DateIn] <= DATEADD(DAY, -1000, CURRENT_TIMESTAMP)
COMMIT TRANSACTION
END

CTE Example (might consider a smaller size of deletions) This has the advantage of working from the oldest date forward;

WHILE EXISTS (SELECT * FROM db.dbo.table WHERE [DateIn] <= DATEADD(DAY, -1000, CURRENT_TIMESTAMP))
BEGIN
BEGIN TRANSACTION
;WITH deleteCTE AS (
    SELECT TOP (100000) *
    FROM db.dbo.table
    WHERE [DateIn] <= DATEADD(DAY, -1000, CURRENT_TIMESTAMP)
    ORDER BY [DateIn] ASC)
DELETE FROM deleteCTE;
COMMIT TRANSACTION
END

Consider alternatives such as creating a table with a set of ID's to delete with some sort of "group indicator" for dates/groups and walking through that with a join; things like that.

Mark Schultheiss
  • 32,614
  • 12
  • 69
  • 100
  • FWIW I only added this here for the CTE example (it is really just syntactic sugar) - strongly consider the options on the other question linked by otheres. – Mark Schultheiss Jan 06 '23 at 16:45