What is the Snowflake SQL version of this SQL Server solution that counts NULL values in each column of a table? The SQL executes via the SQL Server EXEC
function. I would prefer to not use a stored procedure because of access restrictions but such a solution would be good to have for the future.
I would prefer the solution display the column name in one column and the number of NULL values in another a la:
COLUMN_NAME NULLVALUES
HATS 325
SHOES 0
DOGS 9998
The dynamic SQL (SQL built at runtime) below is the best I have been able to do so far. The name of the column containing the column names will unfortunately be the first column in the table. Ideally, the SQL would execute automatically but experiments with BEGIN and END as well as EXECUTE IMMEDIATE did not work.
-- SUMMARY: Count NULL values for every column of the specified table.
-- This SQL dynamically builds a query that counts the NULL values in every column of a table.
-- One copies and pastes the code in the resulting SQLROW column (can double-click to open query window then copy to clipboard)
USE DATABASE YOUR_DATABASE
-- What database contains the table of interest?
SET DBName = 'YOUR_DATABASE';
-- What is the schema of the table?
SET SchemaName = 'YOUR_SCHEMA';
--What is the table name?
SET TableName = 'YOUR_TABLE';
SELECT ($DBName || '.' || $SchemaName || '.' || $TableName) as FullTablePath;
WITH SQLText AS (
SELECT
ROW_NUMBER() OVER (ORDER BY c.column_name) AS RowNum,
'SELECT ' || '''' || c.column_name || '''' ||
', SUM(CASE WHEN ' || c.column_name || ' IS NULL THEN 1 ELSE 0 END) AS NullValues
FROM ' || $DBName || '.' || $SchemaName || '.' || $TableName AS SQLRow
FROM
information_schema.tables t
INNER JOIN information_schema.columns c
ON c.table_name = t.table_name and c.table_schema = t.table_schema
WHERE
t.table_name = $TableName and t.table_schema = $SchemaName),
Recur AS (
SELECT
RowNum,
TO_VARCHAR(SQLRow) AS SQLRow
FROM
SQLText
WHERE
RowNum = 1
UNION ALL
SELECT
t.RowNum,
r.SQLRow || ' UNION ALL ' || t.SQLRow
FROM
SQLText t
INNER JOIN Recur r ON t.RowNum = r.RowNum + 1
),
no_dupes as (
select * from Recur where RowNum = (SELECT MAX(RowNum) FROM Recur)
)
select SQLRow from no_dupes