8

How can I count non-null entries by field/column? I see several answers to count by row but can't hack how to do so for columns.

Input:

╔════╦════════╦════════╦════════╗
║ id ║ field1 ║ field2 ║ field3 ║
║ 1  ║ do     ║ re     ║ me     ║
║ 2  ║ fa     ║        ║ so     ║
║ 3  ║ la     ║ te     ║        ║
║ 4  ║ da     ║ re     ║        ║
╚════╩════════╩════════╩════════╝

output:

id       4
field1   4
field2   3
field3   2

I'm trying to get a gauge on field usage in a very dirty database I am migrating. There's about 50 columns in this database so I am looking for an approach that doesn't involve typing out each column name.

I might also have to extend that search to non-NULL & is-not-empty & ≠ 0 & ≠ "no" because of inconsistencies in data storage — some fields were never used but auto-filled with "no".

This answer looks close to what I need but generates an SQL error and I don't have enough reputation to comment: Count number of NULL values in each column in SQL

Community
  • 1
  • 1
Slam
  • 3,125
  • 1
  • 15
  • 24

3 Answers3

3

Just use count():

select count(field1), count(field2), count(field3)
from table t;

That is what count() does -- it counts non-NULL values.

If you have an aversion to typing out names, then use the metadata tables (typically information_schema.columns) to get the column names. You can generate the SQL as a query, or copy the column names into a spreadsheet to generate the code.

EDIT:

You can generate the code using:

select group_concat('count(', column_name, ')' separate ', ')
from information_schema.columns
where table_name = <whatever> and table_schema = <whatever2>;

Note this uses the little known ability of group_concat() to take multiple string arguments.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • @gordon linoff, i think user wants row wise data, not column wise. – Ankit Bajpai Mar 16 '15 at 12:02
  • I get an SQL syntax error when I run this. Am I supposed to replace `table t` with my table name somewhere? Also with 50 columns names I am looking for a programmatic way to run this WITHOUT typing each of these names by hand. – Slam Mar 16 '15 at 12:14
  • @Gorden, "If you have an aversion to typing out names, then use the metadata tables (typically information_schema.columns) to get the column names. You can generate the SQL as a query..." I know how to use count(). Figuring out how to get column names via SQL query is what i am missing. Could you edit your answer to include that? – Slam Mar 16 '15 at 12:29
  • @Gordon's answer ended up being the correct one and I was too new at the time to understand it. And kudos for the meta-answer of using SQL to output new query language. – Slam Sep 26 '18 at 23:22
2
SELECT count(field1) as cnt ,'field1' as field from tbl where field1 IS NOT NULL
UNION all
SELECT count(field2) as cnt ,'field2' as field from tbl where field2 IS NOT NULL
union all 
SELECT count(field3) as cnt ,'field3' as field from tbl where field3 IS NOT NULL

IF THERE ARE ONLY 3 FIXED FIELD ABOVE SQL WILL WORK

sumit
  • 15,003
  • 12
  • 69
  • 110
0

Here is sqlfiddle

SELECT
  COUNT(id),
  SUM(CASE WHEN field1 IS NOT NULL THEN 1 ELSE 0 END), 
  SUM(CASE WHEN field2 IS NOT NULL THEN 1 ELSE 0 END), 
  SUM(CASE WHEN field3 IS NOT NULL THEN 1 ELSE 0 END)
FROM table1;
Alex
  • 16,739
  • 1
  • 28
  • 51