T-SQL sub select to retrieve rows with columns that have values instead of NULLs

Question

This example is as close as I can demonstrate for the problem I'm trying to solve. For this case the sub-select has some mostly duplicated rows with the only difference being that some have a delvieryDate and some do NOT. If a row of data is available with the date, I want that. If not I'll need to get the row without a date. I tried to fix the sub select with a "Top 1" & "Order by dDate desc" but T-SQL will not allow Order by in a sub-select.

For this example, I want row 9 instead of row 8 since the prior has a date. I also want row 10 since there is no equivalent available with a date.

Code:

Create Table #Customer (
    CustomerId int,
    cName nvarchar(100),
    Address nvarchar(100),
)

INSERT INTO #Customer
VALUES (1, 'Bill', '123 1st St'),
       (2, 'Fred', '111 Market St'),
       (3, 'Lisa', '01 Boulevard')

Create Table #MealType (
    MealTypeId  int,
    mName nvarchar(100)
)
INSERT INTO #MealType
VALUES (1, '1 - Breakfast'), (2, '2 - Lunch'), (3, '3 - Dinner')

Create Table #Food (
    FoodId  int,
    fName nvarchar(100),
    Restaurant nvarchar(100),
    MealType_fk int
)
INSERT INTO #Food
VALUES (3, 'Bacon & Egg Biscut', 'McDs',1),
       (2, 'Happy Meal', 'McDs', 2),
       (1, 'Pizza', 'Dominos', 3),
       (4, 'Santa Fe Salad', 'GrubHub', 2)

Create Table #Delivery (
    DeliveryId  int,
    FoodId_fk int,
    CustomerId_fk int,
    dDate datetime
)
INSERT INTO #Delivery
VALUES (1, 1, 1, CONVERT(date, getdate())),
       (2, 2, 1, CONVERT(date, getdate())),
       (3, 3, 1, CONVERT(date, getdate())),
       (4, 1, 2, CONVERT(date, getdate())),
       (5, 2, 2, CONVERT(date, getdate())),
       (6, 3, 2, CONVERT(date, getdate())),
-- removed data (7, 1, 3, CONVERT(date, getdate())),
       (8, 4, 3, CONVERT(date, getdate())),
       (9, 3, 3, CONVERT(date, getdate())),
      (10, 1, 3, NULL),
      (12, 4, 3, NULL)

select
    c.cName,
    c.Address,
    f.Restaurant,
    f.fName,
    m.mName,
    convert(varchar, d.dDate, 103) as dDate
from #customer c
left join (select distinct
               FoodId_fk,
               CustomerId_fk,
               dDate
           from #Delivery
           ) as d on c.CustomerId = d.CustomerId_fk
join #Food f on d.FoodId_fk = f.FoodId
join #MealType m on f.MealType_fk = m.MealTypeId
order by cName, mName

score 1 · Accepted Answer · answered Jan 11 '20 at 02:10

For this data, one option would be to use aggregation, leveraging the fact that aggregate functions ignore null values:

select
    c.cName,
    c.Address,
    f.Restaurant,
    f.fName,
    m.mName,
    convert(varchar, max(d.dDate), 103) as dDate
from #customer c
join #Delivery d on c.CustomerId = d.CustomerId_fk
join #Food f on d.FoodId_fk = f.FoodId
join #MealType m on f.MealType_fk = m.MealTypeId
group by
    c.cName,
    c.Address,
    f.Restaurant,
    f.fName,
    m.mName
order by cName, mName

Demo on DB Fiddle:

cName | Address       | Restaurant | fName              | mName         | dDate     
:---- | :------------ | :--------- | :----------------- | :------------ | :---------
Bill  | 123 1st St    | McDs       | Bacon & Egg Biscut | 1 - Breakfast | 11/01/2020
Bill  | 123 1st St    | McDs       | Happy Meal         | 2 - Lunch     | 11/01/2020
Bill  | 123 1st St    | Dominos    | Pizza              | 3 - Dinner    | 11/01/2020
Fred  | 111 Market St | McDs       | Bacon & Egg Biscut | 1 - Breakfast | 11/01/2020
Fred  | 111 Market St | McDs       | Happy Meal         | 2 - Lunch     | 11/01/2020
Fred  | 111 Market St | Dominos    | Pizza              | 3 - Dinner    | 11/01/2020
Lisa  | 01 Boulevard  | McDs       | Bacon & Egg Biscut | 1 - Breakfast | 11/01/2020
Lisa  | 01 Boulevard  | GrubHub    | Santa Fe Salad     | 2 - Lunch     | 11/01/2020
Lisa  | 01 Boulevard  | Dominos    | Pizza              | 3 - Dinner    | null

score 1 · Answer 2 · answered Jan 11 '20 at 02:53

It is also possible to start from the output dataset you provide (name it #have). There is a typical quick-and-dirty way to select the max value using ROW_NUMBER() (available in SQL Server 2005 and above).

The advantage of the method is you have the freedom to reduce the number of columns to be compared (defined in partition by). This is not possible in with a simple group by statement because you cannot have output columns other than group or aggregate variables.

The disadvantage is efficiency. The answer of @GMB is obviously preferred whenever applicable. Also consider using cross apply instead for a large dataset.

Code:

with T as (
    select *,
        ROW_NUMBER() over(partition by cName, [Address], Restaurant, fName, mName
                          order by dDate desc) as rn
    from #have
)
select cName, [Address], Restaurant, fName, mName, dDate
from T
where rn = 1
order by cName, mName;

Output:

cName   Address         Restaurant fName                 mName           dDate
Bill    123 1st St      McDs       Bacon & Egg Biscut    1 - Breakfast   11/01/2020
Bill    123 1st St      McDs       Happy Meal            2 - Lunch       11/01/2020
Bill    123 1st St      Dominos    Pizza                 3 - Dinner      11/01/2020
Fred    111 Market St   McDs       Bacon & Egg Biscut    1 - Breakfast   11/01/2020
Fred    111 Market St   McDs       Happy Meal            2 - Lunch       11/01/2020
Fred    111 Market St   Dominos    Pizza                 3 - Dinner      11/01/2020
Lisa    01 Boulevard    McDs       Bacon & Egg Biscut    1 - Breakfast   11/01/2020
Lisa    01 Boulevard    GrubHub    Santa Fe Salad        2 - Lunch       11/01/2020
Lisa    01 Boulevard    Dominos    Pizza                 3 - Dinner      NULL

T-SQL sub select to retrieve rows with columns that have values instead of NULLs

2 Answers2