Is there a way to get a count of distinct values for every column of every table in SQL Server?
I've tried using a cursor for this, but that seems inefficient.
I've got to agree with Sean and say that this is going to be horrifically slow, but if you really want to do it, then I'm not going to stop you.
Something like this could be used as a starting point if you specifically don't want to use a cursor. This took just under a minute to look at a small database I've got with 10 tables in it. The largest table has just a few million rows in it. No matter what, you're going to be doing some sort of iteration, whether that's a cursor or explicitly reading against the table for each column.
Also, if you want to do something like this, you'll likely need to accommodate some edge cases. For example, you can't use COUNT(DISTINCT ...) on xml columns. Like I said, it's a starting point.
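-- Build one UNION ALL query with a COUNT(DISTINCT ...) per column, then run it.
DECLARE @cmd NVARCHAR(MAX);

SELECT @cmd =
    STUFF(
        (
            SELECT
                ' UNION ALL SELECT '''
                + QUOTENAME(SCHEMA_NAME(st.schema_id)) + '.' + QUOTENAME(st.name) + ''' AS [Object], '''
                + QUOTENAME(sc.name) + ''' AS [Column], COUNT(DISTINCT '
                + QUOTENAME(sc.name) + ') AS [Count] FROM '
                + QUOTENAME(SCHEMA_NAME(st.schema_id)) + '.' + QUOTENAME(st.name)
            FROM sys.tables st
            JOIN sys.columns sc
                ON sc.object_id = st.object_id
            -- Skip empty tables. EXISTS avoids emitting one duplicate SELECT
            -- per index and partition, which a plain JOIN to the DMV would do.
            WHERE EXISTS (
                SELECT 1
                FROM sys.dm_db_partition_stats ddps
                WHERE ddps.object_id = st.object_id
                  AND ddps.index_id IN (0, 1)   -- heap or clustered index only
                  AND ddps.row_count > 0
            )
            -- TYPE + .value() keeps characters such as & and < from being XML-escaped.
            FOR XML PATH(''), TYPE
        ).value('.', 'NVARCHAR(MAX)'), 1, 11, ''   -- strip the leading ' UNION ALL '
    );

EXECUTE (@cmd);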
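As a rough sketch of the kind of accommodation mentioned above, the column list could be filtered by data type before the statement is built. The excluded type names below are my assumption about which types COUNT(DISTINCT ...) will reject. The list is illustrative rather than exhaustive, and it does not handle alias types built on top of those base types, so check it against your own schema.

```sql
-- Lists the columns that should be safe to feed into COUNT(DISTINCT ...).
-- Assumption: the excluded type list is illustrative, not exhaustive.
SELECT
    QUOTENAME(SCHEMA_NAME(st.schema_id)) + '.' + QUOTENAME(st.name) AS [Object],
    QUOTENAME(sc.name) AS [Column],
    ty.name AS [Type]
FROM sys.tables st
JOIN sys.columns sc
    ON sc.object_id = st.object_id
JOIN sys.types ty
    ON ty.user_type_id = sc.user_type_id
WHERE ty.name NOT IN ('xml', 'text', 'ntext', 'image', 'geography', 'geometry')
ORDER BY [Object], sc.column_id;
```

The same JOIN to sys.types and the same WHERE filter could be added to the inner SELECT that builds @cmd, so the unsupported columns never make it into the generated statement.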