1

I have two tables (primary and secondary) and we need to do a row and column-level reconciliation between these two tables and get a summary of the differences between these tables.

Table A:

col_A col_B col_C
One Two Three
Four Five Six
Seven Eight Nine

Table B:

col_A col_B col_C
One Two Three
Four Five ABC
Seven Eight Nine
Nine Eight Nine

In the above table col_A is the primary key column. I want to compare Table A and Table B and produce results like below.

Matched Rows: 2

Unmatched Rows: 1

Columns not matching: col_C (sample key: Four)

Rows Present in Table A but not in B: 0

Rows Present in Table B but not in A: 1 ( Sample key:Nine)

Generally, Table A and Table B have approx. billion rows. What would be the efficient way to do it in Snowflake.

Pops
  • 69
  • 1
  • 3

1 Answers1

0

For row comparison, consider the MINUS/EXCEPT set operator.

SELECT count(*) as countOfRowsInTable1NotInTable2 FROM 
( 
    SELECT col_A, col_B, col_C FROM table1
    MINUS 
    SELECT col_A, col_B, col_C FROM table2
)sub;

You can reverse the order of the inner SELECT statments to get the same count for tableB. You can also perform a SELECT * instead of SELECT COUNT(*) if you want to see which rows exist in one table but not the other.

JNevill
  • 46,980
  • 4
  • 38
  • 63