I have three tables in SQL Server 2014, each table has millions of data and keep growing. I am trying to find discrepancies between the tables, for example:
DECLARE @ab TABLE
(
k1 int,
k2 int,
val char(1)
)
DECLARE @cd TABLE
(
k1 int,
k2 int,
val char(1),
add_cd varchar(50)
)
DECLARE @ef TABLE
(
k1 int,
k2 int,
val char(1),
add_ef varchar(50)
)
INSERT INTO @ab VALUES(1,1,'a'), (2, 2, 'c'), (3, 3, 'c'), (4, 4, 'd'), (5, 5, NULL), (7, 7, 'g')
INSERT INTO @cd VALUES(1,1,'a', 'DSFS'), (2, 2, 'b', 'ASDF'), (4, 4, NULL, 'SDFE')
INSERT INTO @ef VALUES(1,1,'a', 'SD1245'), (2, 2, 'b', 'EW3464'), (3, 3, 'd', 'DF3452'),(4, 4, 'd', 'FG4576'), (6, 6, 'e', 'RT3453')
The common Key columns for all three sets are k1 and k2, I would like pull only the discrepancies either the value of "val" should be different or the key combination shouldnt exist in all three sets. No need to compare the additional columns(add_cd and add_ef) just needed in the final result. The desired result is:
k1 K2 val k1 k2 val add_cd k1 k2 val add_ef
2 2 c 2 2 b ASDF 2 2 b EW3464
3 3 c NULL NULL NULL NULL 3 3 d DF3452
4 4 d 4 4 NULL SDFE 4 4 d FG4576
5 5 NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL 6 6 e RT3453
7 7 g NULL NULL NULL NULL NULL NULL NULL NULL
I tried the below query, it gives desired result but works only with few thousands not with millions. Created indexes for the key columns but sill I see it uses table scan. Can anyone please advise on this?
SELECT a.*, c.*, e.*
FROM @ab a
FULL OUTER JOIN @cd c ON a.k1 = c.k1
AND a.k2 = c.k2
FULL OUTER JOIN @ef e ON (c.k1 = e.k1
AND c.k2 = e.k2 )
OR (a.k1 = e.k1
AND a.k2 = e.k2 )
WHERE (a.k1 IS NULL OR c.k1 IS NULL OR e.k1 IS NULL)
OR (ISNULL(a.val, '') != ISNULL(c.val, ''))
OR (ISNULL(c.val, '') != ISNULL(e.val, ''))
OR (ISNULL(a.val, '') != ISNULL(e.val, ''))