1

I am looking for a dynamic MySQL query which would count the distinct values in each column of a table and which would also tell me which of the columns contain Null values.
Here is the sample table, I used 'db_name' as database name and 'table_name' as table name:

+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 |
+------+------+------+------+------+
| a    | d    | j    | o    | q    |
| b    | e    | k    | o    | r    |
| c    | f    | l    | o    | NULL |
| a    | g    | NULL | p    | t    |
| b    | h    | m    | NULL | r    |
| a    | i    | n    | p    | s    |
+------+------+------+------+------+

This is the result set that I would like to get:

+----------+---------------+---------------+
| Col Name | Unique values | Contains Null |
+----------+---------------+---------------+
| Col1     |             3 | FALSE         |
| Col2     |             6 | FALSE         |
| Col3     |             6 | TRUE          |
| Col4     |             3 | TRUE          |
| Col5     |             5 | TRUE          |
+----------+---------------+---------------+

So far this is what I have managed to figure out:

-- list all the column names

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'table_name';

-- count the distinct values in a column

SELECT COUNT(DISTINCT Col1) Col1 
FROM table_name;

-- tell if a column contains any Null

SELECT
  (CASE WHEN (SUM(CASE WHEN Col1 IS NULL THEN 1 ELSE 0 END)) > 0 THEN 'TRUE' 
  ELSE 'FALSE' END) 'Contains Null'
FROM table_name;

-- combining the queries

SELECT
  (SELECT COLUMN_NAME
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE TABLE_NAME = 'table_name' AND COLUMN_NAME = 'Col1') 'Col Name', 
  (SELECT COUNT(DISTINCT Col1) 
  FROM table_name) 'Unique values',
  (SELECT (CASE WHEN (SUM(CASE WHEN Col1 IS 
  NULL THEN 1 ELSE 0 END)) > 0 THEN 'TRUE' ELSE 'FALSE' END)
  FROM table_name) 'Contains Null';

Now, I assume I would need to build a loop that goes through each column and unifies the records returned by the query or inserts them into a new table. The problem is, I am relatively new to SQL and I am not really familiar with loops and variables yet.

I found a few questions similar to mine, but none of them gave me a clear answer:

SQL Server count number of distinct values in each column of a table

Return column name and distinct values

SQL: count number of distinct values in every column

Vadim Kotov
  • 8,084
  • 8
  • 48
  • 62
gyrgy2
  • 11
  • 3
  • This is difficult in sql because the resultsets for your fields will potentially be very different and MySQL handles data records, as opposed to columns. This means that the output has to be constructed row by row. Since MySQL does not support full outer joins, it is a bit tricky to construct this output. I think it would be a lot easier to assemble the expected results in an application, as opposed to MySQL. – Shadow Mar 13 '18 at 12:40

1 Answers1

0

I just needed thing similiar to this question (get count of all table distinct values and easy way to get it from any table using only SQL) so I've made it on this way. Hopefully it may help somebody in cases like this one.

SET @processedtable := 'myprecioustablename';
SET @columnnames := (
    SELECT GROUP_CONCAT(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = @processedtable);
SET @qrypartcount := REPLACE(@columnnames, ',','), COUNT(DISTINCT ');
SET @validquery := CONCAT("SELECT COUNT(DISTINCT ", @qrypartcount, ") FROM ", @processedtable);
PREPARE stmt FROM @validquery;
EXECUTE stmt;
Ozi
  • 1