0

I'm a SQL novice, and usually figure things out via Google and SO, but I can't wrap my head around the SQL required for this.

My question is similar to Delete sql rows where IDs do not have a match from another table, but in my case I have a middle table that I have to query, so here's the scenario:

We have this INSTANCES table that basically lists all the occurrences of files sent to the database, but have to join with CROSS_REF so our reporting application knows which table to query for the report, and we just have orphaned INSTANCES rows I want to clean out. Each DETAIL table contains different fields from the other ones.

I want to delete all single records from INSTANCES if there are no records for that Instance ID in any DETAIL table. The DETAIL table got regularly cleaned of old files, but the Instance record wasn't cleaned up, so we have a lot of INSTANCE records that don't have any associated DETAIL data. The thing is, I have to select the Table Name from CROSS_REF to know which DETAIL_X table to look up the Instance ID.

In the below example then, since DETAIL_1 doesn't have a record with Instance ID = 1001, I want to delete the 1001 record from INSTANCES.

INSTANCES

Instance ID Detail ID
1000 123
1001 123
1002 234

CROSS_REF

Detail ID Table Name
123 DETAIL_1
124 DETAIL_2
125 DETAIL_3

DETAIL_1

Instance ID
1000
1000
2999
redOctober13
  • 3,662
  • 6
  • 34
  • 61
  • You need to use *dynamic sql* to accomplish such a task. So I would suggest to google for this term and find some examples on how you can construct dynamic queries where a table name participating in the query is not known beforehand. – Giorgos Betsos Jan 08 '21 at 14:52
  • Does not need dynamic SQL if it's just 3 fixed tables. How many tables do you have of `DETAIL` – Charlieface Jan 08 '21 at 14:56
  • @Charlieface around 200 tables. An Instance will have records in only one detail table, but in cases where the detail records have been deleted, I want to delete the instance record, since those are what the user picks from, and I don't want to show options for things which don't have detail data. – redOctober13 Jan 08 '21 at 17:29
  • Personally I would start by combining all the tables into one. The setup you have completely breaks Normal Forms rules – Charlieface Jan 09 '21 at 18:36

3 Answers3

0

Storing table names or column names in a database is almost always a sign for a bad database design. You may want to change this and thus get rid of this problem.

However, when knowing the possible table names, the task is not too difficult.

delete from instances i
where not exists
(
  select null
  from cross_ref cr
  left join detail_1 d1 on d1.instance_id = i.instance_id and cr.table_name = 'DETAIL_1'
  left join detail_2 d2 on d2.instance_id = i.instance_id and cr.table_name = 'DETAIL_2'
  left join detail_3 d3 on d3.instance_id = i.instance_id and cr.table_name = 'DETAIL_3'
  where cr.detail_id = i.detail_id
  and 
  (
    d1.instance_id is not null or
    d2.instance_id is not null or
    d3.instance_id is not null    
  )
);

(You can replace is not null by = i.instance_id, if you find that more readable. In that case you could even remove these criteria from the ON clauses.)

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • I'd happily hear any suggestions for a better database design as we made it without the help of an experienced SQL architect. The thing is that while I know the table names, there's 100+ of them, and I wanted to avoid writing 100 queries. But if I did use the table names, I see no reason to use the cross-reference table as per your suggestion since you're already joining to detail_X. – redOctober13 Jan 08 '21 at 15:10
  • 1
    @redOctober13 Another thing about your design that is less than ideal - why have 100+ detailXX tables instead of just one table with an extra column specifying the type of detail? Then your DELETE query becomes way simpler. It alos reduces the complexity of application interactions with the DB (i assume thats how these various detail tables are being filled). – Doug Coats Jan 08 '21 at 15:14
  • Well, each DETAIL table has a completely different set of columns (minus a few common fields). So we have this INSTANCES table that basically lists all the things, but have to join with CROSS_REF so our reporting application knows which table to query for the data, and we just have orphaned INSTANCES rows I want to clean out. – redOctober13 Jan 08 '21 at 15:51
  • @redOctober13 I dont even understand the relationship between instances and other tables. On its surface and without further context one seems unnecessary. – Doug Coats Jan 08 '21 at 15:53
  • @DougCoats Basically we have 100 different files (different layouts) that can be sent to the database. Each one gets its records loaded to it's own table (DETAIL). I have a cross-reference table that says FILE_X maps to DETAIL_X. Ugly, as others have noted, but I didn't design it. – redOctober13 Jan 08 '21 at 16:01
0

this answer assumes you have sequential data in the CROSS_REF table. If you do not, you'll need to alter this to account it (as it will bomb due to missing object reference).

However, this should give you an idea. It also could probably be written to do a more set based approach, but my answer is to demonstrate dynamic sql use. Be careful when using dynamic SQL though.

    DECLARE @Count INT, @Sql VARCHAR(MAX), @Max INT;

    SET @Count = (SELECT MIN(DetailID) FROM CROSS_REF)
    SET @Max = (SELECT MAX(DetailID) FROM CROSS_REF)

    WHILE @Count <= @Max
        BEGIN
            IF (select count(*) from CROSS_REF where file_id = @count) <> 0
            BEGIN
            SET @sql ='DELETE i
            FROM Instances i
            WHERE NOT EXISTS
                (
                    SELECT InstanceID
                    FROM '+(SELECT TableName FROM Cross_Ref WHERE DetailID=@Count)+' d
                    WHERE d.InstanceId=i.InstanceID
                    AND i.detailID ='+ cast(@Count as varchar) +'
                )
                AND i.detailID ='+ cast(@Count as varchar)
            EXEC(@sql);
            SET @Count=@Count+1
            END
        END
    
Doug Coats
  • 6,255
  • 9
  • 27
  • 49
  • There's a missing parenthesis here. Also, you should be ensuring to *securely* inject a dynamic object's name, just in case there's any "unfriendly" characters in there. Finally, you should really be using `sp_executesql`; `EXEC(@SQL)` can't be parametrised. [Bad Habits to Kick : Using EXEC() instead of sp_executesql](https://sqlblog.org/2011/09/17/bad-habits-to-kick-using-exec-instead-of-sp_executesql) – Thom A Jan 08 '21 at 15:44
  • @Larnu I did say this was only to demonstrate the idea not to mention i did say "be careful." I personally try not to use dynamoic sql if it can be avoided not to mention if the data structure suggested wasnt awful dynamic sql wouldnt even need to be used. Thanks though :P – Doug Coats Jan 08 '21 at 15:52
  • I'm not saying the answer is "wrong", Doug (apart from the missing right parenthesis after `@Count` which will cause a syntax error), just that they're good habits to use; securely quoting to ensure that no injection can happen and the latter as it makes it easier to parametrise and promotes it (parametrisation). :) – Thom A Jan 08 '21 at 15:54
  • What's protecting from deleting all the INSTANCE rows that don't have a match in the DETAIL table you're looking at, but might have a match in another detail table? – redOctober13 Jan 08 '21 at 15:55
  • 1
    @Larnu youre right, sorry didnt mean to sound snoody. I couldve included more detail about why one should be careful about dynamic sql. – Doug Coats Jan 08 '21 at 15:55
  • @redOctober13 I think I glossed over the fact there is a detailID in instances, let me alter it for you. – Doug Coats Jan 08 '21 at 15:57
  • if not sequential you will need an if check - i.e. IF (SELECT...) IS NOT NULL. Make sense? – Doug Coats Jan 08 '21 at 16:09
  • Which `select` does that go around, inner or outer? – redOctober13 Jan 08 '21 at 17:05
  • Sorry @DougCoats, just realized after all this back and forth that your query doesn't look at the detail table itself to see whether there's any data for the Instance ID, and that's the critical piece. If I can create a view with the Instance ID, Detail ID, Table Name, and number of detail records for that instance ID, I should be able to solve this. – redOctober13 Jan 08 '21 at 17:17
  • @RedOctober13 it does though - the NOT EXISTS sub query is using the Count variable to find the TableName from CROSS_REF. – Doug Coats Jan 08 '21 at 17:39
  • @DougCoats I tried your version, and it ended up deleting all instance records after 2 loops. I need to delete only where there's an Instance that *SHOULD* have detail data, but there's not actually any detail records in that table for that instance. – redOctober13 Jan 08 '21 at 19:38
  • I just had to add another `AND i.detailID ='+cast(@Count as varchar)+'` after the final closing parentheses in the WHERE clause, and that did it! I still didn't figure out where to add `IF(select...) IS NULL` because SQL Server complained about it when I put that right inside the outer WHERE clause. – redOctober13 Jan 08 '21 at 20:03
0

Much thanks to @DougCoats, this is what I ended up with.

So here's what I ended up with (@Doug, if you want to update your answer, I'll mark yours correct).

DECLARE @Count INT, @Sql VARCHAR(MAX), @Max INT;

SET @Count = (SELECT MIN(DetailID) FROM CROSS_REF)
SET @Max = (SELECT MAX(DetailID) FROM CROSS_REF)

WHILE @Count <= @Max
    BEGIN
        IF (select count(*) from CROSS_REF where file_id = @count) <> 0
        BEGIN
        SET @sql ='DELETE i
        FROM Instances i
        WHERE NOT EXISTS
            (
                SELECT InstanceID
                FROM '+(SELECT TableName FROM Cross_Ref WHERE DetailID=@Count)+' d
                WHERE d.InstanceId=i.InstanceID
                AND i.detailID ='+ cast(@Count as varchar) +'
            )
            AND i.detailID ='+ cast(@Count as varchar)
        EXEC(@sql);
        SET @Count=@Count+1
        END
    END
redOctober13
  • 3,662
  • 6
  • 34
  • 61