0

Trying to figure out the most efficient way write a query to keep a table up to date.

For example I have 3 tables and let's call them TableA, TableB and TableC

TableA is the one needs to be kept up to date with unique codes. All 3 tables have a column named Code.

The query must look inside TableB and TableC to find unique codes that do not exist in TableA and insert them into TableA.

Note that both TableB and TableC tables have ~55k records in each and TableA ~1k. Number of records in tables rarely changes. Max 1k records a year in total.

First option I came up with the following but didn't like the execution plan:

INSERT INTO TableA (Code)
SELECT DISTINCT Code FROM TableB WHERE Code NOT IN(SELECT Code FROM TableA)
UNION
SELECT DISTINCT Code FROM TableC WHERE Code NOT IN(SELECT Code FROM TableA)

Also looked into this link which shows LEFT OUTER JOIN option. But couldn't figure out to join 3 tables to get codes that is missing in TableA but exist in either TableB Or TableC.

GMB
  • 216,147
  • 25
  • 84
  • 135
akd
  • 6,538
  • 16
  • 70
  • 112
  • Better question is why you need to do this. This smells like a [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). You seem to be intentionally duplicating data. Is this hiding a schema problem where you don't have (but should) actual relationships among the tables? – SMor Nov 08 '20 at 14:44
  • A view (whether materialized or not) would automatically keep your codes "up-to-date". – SMor Nov 08 '20 at 14:45
  • Legacy is the answer to your question. – akd Nov 08 '20 at 15:28

1 Answers1

1

Here is one option that UNION ALL both tables, then filters out with NOT EXISTS and a correlated subquery.

INSERT INTO TableA (code)
SELECT DISTINCT code 
FROM (SELECT code FROM TableB UNION ALL SELECT code FROM TableC) t
WHERE NOT EXISTS (SELECT 1 FROM TableA ta WHERE ta.code = t.code)

You do want an index on TableA(code) to speed up the subquery.

I am wondering whether deduplicating first, using UNION would offer better performance:

INSERT INTO TableA (Code)
SELECT Code 
FROM (SELECT code FROM TableB UNION SELECT code FROM TableC) t
WHERE NOT EXISTS (SELECT 1 FROM TableA ta WHERE ta.code = t.code)
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Yeap there is unique index in TableA for Code column. And the second query perform better than first one( UNION instead of UNION ALL). And both better than mine. One thing noticed your query results includes 'NULL' result whereas mine didn't. I guess t.code is not null at the end of the query to ignore NULL result is the solution? – akd Nov 08 '20 at 15:20
  • @akd: yes. `where not exists (...) and t.code is not null` will do it. – GMB Nov 08 '20 at 15:28