0
  1. I have two csv files made of a table of a database.
  2. One is one month older backup than the other.
  3. I need to find differences in them by comparing the two csv:s
  4. Normally I'd use the first column, the id, as the unique identifier, as the key to compare the rows with, but in this case it is not available.
  5. The table has dozens of columns so surely there would be one that has only unique values. Meaning that no two or more rows have the same value in that column.

Is there an SQL query or any way to find out a column in a table that doesn't have duplicates on any row? Hence, that column could be used as the unique identifier of a row. The database is SQL Server 2012.

Steve Waters
  • 3,348
  • 9
  • 54
  • 94
  • 1
    Possible duplicate of [sql query to return differences between two tables](https://stackoverflow.com/questions/2077807/sql-query-to-return-differences-between-two-tables) – Syntax Error Feb 20 '18 at 11:24
  • 1
    Possible duplicate of [How to find out whether a table has some unique columns](https://stackoverflow.com/questions/6941049/how-to-find-out-whether-a-table-has-some-unique-columns) – Cam Feb 20 '18 at 11:26
  • 1
    Possible duplicate of [How to find out whether a table has some unique columns](https://stackoverflow.com/questions/6941049/how-to-find-out-whether-a-table-has-some-unique-columns) – Udayraj Deshmukh Feb 20 '18 at 11:30

2 Answers2

0

You can write a script to group by all the columns individually and see the count. You can find the fields which are most distinct. You can combine the fields to make it more distinct. Good Luck!

Venkat Rao
  • 173
  • 1
  • 10
0

I used this script for similar purpose:

USE [Database]

DECLARE @name sysname
DECLARE @cntAll int
DECLARE @cntDist int
DECLARE @cntNull int
DECLARE @err int
DECLARE @stm nvarchar(max)
DECLARE @tblName sysname

SET @tblName = 'Table'

-- All rows
SET @stm = N'SELECT @cntAll = COUNT(*) FROM dbo.[' + @tblName + ']'
EXEC @err = sp_executesql @stm, N'@cntAll int OUTPUT', @cntAll OUTPUT
IF @err <> 0 BEGIN
    RETURN
END

-- Distinct rows by column
DECLARE columns_tables CURSOR GLOBAL FORWARD_ONLY READ_ONLY FOR
SELECT [name] FROM sys.columns WHERE object_id = OBJECT_ID('dbo.[' + @tblName + ']') 

OPEN columns_tables
FETCH NEXT FROM columns_tables INTO @name
WHILE (@@FETCH_STATUS = 0) BEGIN

    SET @stm = 
        N'SELECT @cntDist = COUNT(DISTINCT [' + @name + ']) '+
        'FROM dbo.[' + @tblName + '] '+
        'WHERE [' + @name + '] IS NOT NULL'
    EXEC @err = sp_executesql @stm, N'@cntDist int OUTPUT', @cntDist OUTPUT
    IF @err <> 0 BEGIN
        CLOSE columns_tables
        DEALLOCATE columns_tables
        BREAK
    END

    SET @stm = 
        N'IF EXISTS (' +
            'SELECT ([' + @name + ']) '+
            'FROM dbo.[' + @tblName + '] '+
            'WHERE [' + @name + '] IS NULL'+
        ') SET @cntNull = 1 '+
        'ELSE SET @cntNull = 0'
    EXEC @err = sp_executesql @stm, N'@cntNull int OUTPUT', @cntNull OUTPUT
    IF @err <> 0 BEGIN
        CLOSE columns_tables
        DEALLOCATE columns_tables
        BREAK
    END

    IF (@cntAll = @cntDist) AND (@cntNull = 0) BEGIN
        PRINT 'Possible column ' + @name
    END

    FETCH NEXT FROM columns_tables INTO @name
END
CLOSE columns_tables
DEALLOCATE columns_tables
Zhorov
  • 28,486
  • 6
  • 27
  • 52