0
SELECT [Cole_Crops], 
       PERCENT_RANK() OVER (PARTITION BY HUC02 
                            ORDER BY [Cole_Crops]) AS Cole_Crops_PCT_RANK], 
       [Row_Crops], 
       PERCENT_RANK() OVER (PARTITION BY HUC02 
                            ORDER BY [Row_Crops]) AS Row_Crops_PCT_RANK]            
FROM V012_CDLMAX_09

How do you exclude NULL values from affecting the PERCENT_RANK in a query with multiple columns to be ranked?

SQL Server 2012 PERCENT_RANK() treats NULLS as a real value in the distribution. If there were one column to be ranked, you can simply add a where clause (i.e., where Cole_Crops IS NOT NULL).

Giorgos Betsos
  • 71,379
  • 9
  • 63
  • 98
JacamoFinane
  • 65
  • 2
  • 11
  • 1
    I don't think this can be done directly in the select statement. All of the windowed functions are run against the entire query result set. https://msdn.microsoft.com/en-GB/library/ms189461.aspx – David Rushton Dec 04 '15 at 15:45

4 Answers4

5

Here's an easy fix, use a case statement in your partition clause.

CASE 
    WHEN COLUMN_1 IS NULL THEN NULL 
    ELSE PERCENT_RANK() OVER 
        (PARTITION BY 
            CASE 
                WHEN COLUMN_1 IS NULL THEN 0 
                ELSE 1 
            END 
         ORDER BY COLUMN_1) 
END AS RANK_COLUMN_1

This will rank nulls and non-nulls individually, but the wrapping case statement will display null values as nulls.

Problem solved!

Dan Oberlam
  • 2,435
  • 9
  • 36
  • 54
Wes
  • 51
  • 1
  • 3
0

You can use a case statement in your Percent_Rank ORDER BY. You can make NULL values whatever you want them to be. Below it is treating all NULL values as 0

SELECT [Cole_Crops], 
       PERCENT_RANK() OVER (PARTITION BY HUC02 
                            ORDER BY CASE WHEN [Cole_Crops] IS NULL THEN 0 ELSE [Cole_Crops] END) AS Cole_Crops_PCT_RANK], 
       [Row_Crops], 
       PERCENT_RANK() OVER (PARTITION BY HUC02 
                            ORDER BY CASE WHEN [Row_Crops] IS NULL THEN 0 ELSE [Row_Crops] END) AS Row_Crops_PCT_RANK]            
FROM V012_CDLMAX_09
JamieD77
  • 13,796
  • 1
  • 17
  • 27
  • 1
    Thank you. To clarify, I am hoping to exclude NULLS from the distribution entirely so they don't affect the relative ranking. For instance, n=100 with NULLS and n=75 without NULLS. Converting NULL to another value keeps n=100 which changes the percent ranking of the values being examined. I hope I understood your idea completely. – JacamoFinane Dec 04 '15 at 15:35
  • 1
    ISNULL([FieldName], 0) would achieve the same result with less typing. But I'm not sure this is what the OP wants. @JacamoFinane are you trying to rank only the non-null values in the column? Are you looking for null records to have no rank at all? – David Rushton Dec 04 '15 at 15:38
  • yes, desire NULL records to have no rank at all. Can be achieved when a single value in the query is to be ranked by using a where clause but troubling when there are 30+ values requiring ranking. Am hoping to avoid separate queries for each ranking. Thank you – JacamoFinane Dec 04 '15 at 15:43
  • Sorry you can't do this without working around. See this open Connect ticket for more: https://connect.microsoft.com/SQLServer/feedback/details/532474 – David Rushton Dec 04 '15 at 19:26
0

You can try adding a condition in PARTITION BY clause:

SELECT [Cole_Crops], 
   PERCENT_RANK() OVER (PARTITION BY [Cole_Crops] is not null, HUC02 
                        ORDER BY [Cole_Crops]) AS Cole_Crops_PCT_RANK], 
   [Row_Crops], 
   PERCENT_RANK() OVER (PARTITION BY [Row_Crops] is not null, HUC02 
                        ORDER BY [Row_Crops]) AS Row_Crops_PCT_RANK]
FROM V012_CDLMAX_09
SHR
  • 7,940
  • 9
  • 38
  • 57
0

Just a little revision to Wes's answer. It should be:

CASE 
    WHEN COLUMN_1 IS NULL THEN NULL 
    ELSE PERCENT_RANK() OVER 
        (PARTITION BY 
             HUC02
            END 
         ORDER BY COLUMN_1) 
END AS RANK_COLUMN_1
Ruli
  • 2,592
  • 12
  • 30
  • 40
Laueby
  • 1