24

I'm trying to create a query that will return all the rows that have a null value across all but 1 column. Some rows will have more than one null entry somewhere. There is one column I will want to exclude, because at this moment in time all of the entries are null and it is the only column that is allowed to have null values. I am stuck because I don't know how to include all of the columns in the WHERE.

SELECT *
FROM Analytics
WHERE * IS NULL

Alternatively, I can do a count for one column, but the table has about 67 columns.

SELECT COUNT(*)
FROM Analytics
WHERE P_Id IS NULL
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
Snake_Plissken
  • 390
  • 1
  • 4
  • 14
  • 1
    You have to do it manually. You need to list all the columns. You can generate this with TSQL and execute it by looking up the table schema then generating TSQL that lists all 67 columns. But there's not a simple way to check values against multiple columns in a single statement such as `* IS NULL` – Eli Gassert Jan 23 '13 at 20:39
  • 6
    What RDBMS and version are you using? – Lamak Jan 23 '13 at 20:48
  • 1
    This is SQLServer 2005 :'[. I figured I might have to do it individually. – Snake_Plissken Jan 23 '13 at 21:09

5 Answers5

16

In SQL Server you can borrow the idea from this answer

;WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema-instance' as ns)
SELECT *
FROM   Analytics
WHERE  (SELECT Analytics.*
        FOR xml path('row'), elements xsinil, type
        ).value('count(//*[local-name() != "colToIgnore"]/@ns:nil)', 'int') > 0

SQL Fiddle

Likely constructing a query with 67 columns will be more efficient but it saves some typing or need for dynamic SQL to generate it.

Community
  • 1
  • 1
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
1

Depending on which RDBMS you're using, I think your only option (rather than explicitly saying WHERE col1 IS NULL and col2 IS NULL and col3 IS NULL ...) would be to use Dynamic SQL.

For example, if you want to get all the column names from a SQL Server database, you could use something like this to return those names:

SELECT
     name
FROM
     sys.columns
WHERE
     object_id = OBJECT_ID('DB.Schema.Table')

You could use FOR XML to create your WHERE clause:

SELECT Name + ' IS NULL AND ' AS [text()]
FROM sys.columns c1
WHERE     object_id = OBJECT_ID('DB.Schema.Table')
ORDER BY Name
FOR XML PATH('')

Hope this helps get you started.

Good luck.

sgeddes
  • 62,311
  • 6
  • 61
  • 83
1

I don't have such a table to test, assuming there is no 'x' as data in any field, I think this should work on Sql-Server; (DEMO)

NOTE: I have filtered keyColumn as c.name != 'keyColumn'

DECLARE @S NVARCHAR(max), @Columns VARCHAR(50), @Table VARCHAR(50)

SELECT @Columns = '66', --Number of cols without keyColumn
       @Table = 'myTable'

SELECT @S =  ISNULL(@S+'+ ','') + 'isnull(convert(nvarchar, ' + c.name + '),''x'')'  
FROM sys.all_columns c 
WHERE c.object_id = OBJECT_ID(@Table) AND c.name != 'keyColumn'

exec('select * from '+@Table+' where ' + @S + '= replicate(''x'',' + @Columns + ')')
Kaf
  • 33,101
  • 7
  • 58
  • 78
0

For the SQL beginner user like me, all the query above seem so hard to digest. I think the quickest way is just to write query for all 67 columns. It's just basically a copy&paste process. E.g:

select count(*) from user where id is null or 
 name is null or 
 review_count is null or
 yelping_since is null or 
 useful is null or 
 funny is null;
0

This is for SQLite, so slightly different, but I had the same problem and I ended up writing this small Python script findnulls.py to find all null values in a database:

import sqlite3
import sys

def main():
    dbfile = sys.argv[1]
    conn = sqlite3.connect(dbfile)
    c = conn.cursor()

    tables = c.execute("SELECT name FROM sqlite_master WHERE type='table';").fetchall()

    for table in tables:
        tablename = table[0]
        cols = c.execute("PRAGMA table_info({0})".format(tablename)).fetchall()

        for col in cols:
            colname = col[1]
            nullvals = c.execute("SELECT COUNT(*) FROM {0} WHERE {1} IS NULL;".format(tablename, colname)).fetchall()
            if nullvals[0][0] != 0:
                print("found {0} nulls in table: {1} column {2}".format(nullvals[0][0], tablename, colname))

main()

Where you would run it like this:

findnulls.py somedatabase.db

And the output looks like this:

found 1 nulls in table: Channels column MessageId
found 1 nulls in table: Channels column MessageChannel
found 1 nulls in table: Channels column SignalType    
found 7 nulls in table: Channels column MinVal        
found 7 nulls in table: Channels column MaxVal        
found 9 nulls in table: Channels column AvgVal        
found 9 nulls in table: Channels column Median
found 9 nulls in table: Channels column StdDev
skelliam
  • 522
  • 3
  • 11