-1

I have a business requirement to delete all records in from multiple databases older than 1 year. Is there a way to do this at the database level, or must each individual table must be scripted?

I see there are various ways to do this at the table level, but I was wondering if it can be done at the database level?

Table level: (for anyone looking for how to delete at the table level):

JM1
  • 1,595
  • 5
  • 19
  • 41
  • 2
    You need to delete the rows or partitions from each table, although you can set up a script to read the system tables to generate dynamic SQL. – Gordon Linoff Jul 20 '20 at 10:38
  • 1
    Use `system.tables` to find columns with date type - `date`, `datetime`, `datetime2`. Then, check foreign constraints and built the `DELETE` statements based on their order. – gotqn Jul 20 '20 at 11:00

1 Answers1

1

The essence of this will be using SQL to write SQL. Here's a starting point:

SELECT DISTINCT REPLACE(REPLACE(REPLACE(
  'DELETE FROM {s}.{t} WHERE {c} < DATEADD(YEAR, -1, GetUtcDate())'
  '{s}', table_name),
  '{t}', table_name),
  '{c}', column_name)
FROM
  INFORMATION_SCHEMA.COLUMNS
WHERE
  column_name like '%Created%'

It queries the info schema for a list of all the columns containing the word Created (assumption: that your tables contain columns with names like CreatedAt, CreatedOn, CreatedDate, Created, RecordCreated etc; adjust for your scenario) and puts the table/column names into a string that is an SQL that deletes data

You run this, then you copy the results out of the grid and into another query window and run it. It is thus an SQL that writes SQL

After you get down with the concept, you could look to automate it by having some program or dynamic SQL select the delete queries and run them

You can also "level up" if you have multiple databases, by making a query that hits sys.databases to find out all the database names that need querying, and then write an SQL that hits the DB list and generates variations of this sql that hits a specific DB, that is an SQL that hits specific tables for deletion

It's like Inception; be sure you understand the outermost layer of the onion before you start digging deeper

Bear in mind also that your data may be subject to constraints that mean you either have to:

  • delete them in a certain order or
  • keep re-running the delete until no more deletes are done (the first time you try a parent-then-child delete it may fail becuase children exist, the second time you might have deleted all the children so the parent can be removed - if any children remain this route fails, but maybe you want it to fail) or
  • configure your constraints to delete cascade so that deleting a record from the parent deletes records from the children

Warning: this is a seriously destructive action, and you could easily end up removing more than you intended or leaving your database in a state where referential integrity is broken and your app stops working.

This is not something that is just flippently designed in a quick 5 minutes of an SO question; this is a weeks+ long project to make sure you're only deleting relevant stuff. For example configuring the keys for delete cascade and then accidentally removing the "Application Status: Approved" enum record just because it was put into the DB more than a year ago could wipe out the entire database, erasing every approved application (even those approved yesterday), every account that came from it, all their transactions etc.

Think very carefully about this

Caius Jard
  • 72,509
  • 5
  • 49
  • 80
  • What is the name for "SQL writing SQL queries"? – xhr489 Jul 21 '20 at 09:45
  • At one level deep, cannibaliSQLm. At two levels, a headache. At 3+ levels, a mess ;) – Caius Jard Jul 21 '20 at 10:33
  • 1
    Oh, I guess you could call it metaprogramming or something trendy - try https://en.wikipedia.org/wiki/Automatic_programming but consider that nearly everything we do with a computer, programmingwise, is some form of code that writes code (everyone operates at some level of abstraction above flipping logic gates in the CPU) so you might just call it "programming" – Caius Jard Jul 21 '20 at 19:29
  • Makes sense. I was just reading about metaprogramming on a website. Is dynamic SQL also considered "metaprogramming".? Thanks – xhr489 Jul 21 '20 at 20:20
  • 1
    I suppose it depends on your perspective; dynamically forming an SQL, in another programming language, which is essentially a program that will be compiled, executed and produce an output would probably qualify for me.. But then I've just said that writing a C#, and compiling it to something that is compiled to something that is compiled to something that runs on the hardware is also metaprogramming, which is programming. What qualifies as X only really matters in a conversation between you and someone else where you both understand what you each mean by X. Compiler philosophy! – Caius Jard Jul 21 '20 at 20:24