1

I am attempting to retrieve the most recent address associated with a name (as there are usually multiple addresses per name). I have created the following query with a subquery in my join. I am able to pull back one address per party as I wanted but the address/city doesn't always match up or the address that's returning is not the most recent address (per the date associated). Am I writing the subquery with MAX wrong? Is there a way to make sure the correct city goes with the correct address?

select 
    [demographics].[DOB],
    Address1,
    Address2,
    City,
    State,
    Zip

from [demographics]
    left join (select
                    [address].[demographic_ID], 
                    max([address].[address1]) as Address1,
                    max([address].[address2]) as Address2,
                    max([address].[city]) as City,
                    max([states].[state_title]) as 'State',
                    max([dbo].[address].[zip]) as Zip
                from [address]
                    inner join [states] on [address].[state_ID] = [states].[state_ID]
                group by [address].[demographic_ID]) as AddressSub 
        on [demographics].[demographic_ID] = AddressSub.[demographic_ID];
xQbert
  • 34,733
  • 2
  • 41
  • 62
Samantha
  • 23
  • 5
  • Yes, you are writing it wrong. Think about how you might use TOP 1 instead of MAX. – Tab Alleman May 02 '16 at 14:27
  • I would think it would be easier to use an analytic such as `Row_Number over (partition by Address.PK order by Date? ID? [what field determines most recent] desc)` and then limit it to the row_number 1. – xQbert May 02 '16 at 14:29
  • Yes, by the way, you need to have a column that determines which address is the most recent, which I don't see being used in your query. – Tab Alleman May 02 '16 at 14:29
  • Thank you, Tab! That worked to keep my address together and then i ordered by my "[address1] desc" to get it to pull with the most recent address per the date. – Samantha May 02 '16 at 14:33
  • Possible duplicate of [SQL join: selecting the last records in a one-to-many relationship](http://stackoverflow.com/questions/2111384/sql-join-selecting-the-last-records-in-a-one-to-many-relationship) – Tab Alleman May 02 '16 at 14:34

4 Answers4

0

To do this right I would need to know the PK of Address table and how you determine "Most recent"

SELECT
    [demographics].[DOB],
    Address1,
    Address2,
    City,
    State,
    Zip

FROM [demographics]
LEFT JOIN (SELECT
           Row_number() over (partition by Demographic_ID ORDER BY Demographic_ID Desc) RN
           [address].[demographic_ID], 
           [address].[address1] as Address1,
           [address].[address2] as Address2,
           [address].[city] as City,
           [states].[state_title] as 'State',
           [address].[zip] as Zip
           FROM [address]
           INNER JOIN [states] 
              on [address].[state_ID] = [states].[state_ID]
           GROUP BY [address].[demographic_ID]) as AddressSub 
  on [demographics].[demographic_ID] = AddressSub.[demographic_ID]
 and  AddressSub.RN = 1
xQbert
  • 34,733
  • 2
  • 41
  • 62
  • You want to partition by `demographic_id`. – Gordon Linoff May 02 '16 at 14:34
  • Yeah, I can see that now in the join. But still don't know how to order it. unless we assume the highest demographic_ID is the most recent... – xQbert May 02 '16 at 14:37
  • . . You then want to order by some sort of creation date or the primary key (if you know it is an identity column). – Gordon Linoff May 02 '16 at 14:39
  • This worked successfully but is pulling by oldest date, not most recent date. The PK for the address table is Address_ID, however, somehow I have to be able to order by date desc, possibly? – Samantha May 02 '16 at 14:59
  • Got it; in the 'row_number' line i ordered by [changed_date] and it worked! Thank you all for you help! – Samantha May 02 '16 at 15:13
0

the most recent would have be the last entry in the address column

select 
[demographics].[DOB],
Address1,
Address2,
City,
State,
Zip

from [demographics]
left join (select top 1 max([address.EntryDate)
                [address].[demographic_ID], 
                [address].[address1] as Address1,
                [address].[address2] as Address2,
                [address].[city] as City,
                [states].[state_title] as 'State',
                [dbo].[address].[zip] as Zip
            from [address]
                inner join [states] on [address].[state_ID] = [states].[state_ID]
            group by [address].[demographic_ID], 
                [address].[address1] as Address1,
                [address].[address2] as Address2,
                [address].[city] as City,
                [states].[state_title] as 'State',
                [dbo].[address].[zip] as Zip

    on [demographics].[demographic_ID] = AddressSub.[demographic_ID];
0

If you look at your sub query what you are actually asking for is the maximum (as these are strings alphabetically maximum) field for each of the columns. This is unlikely to return fields from the same address row. For instance if you had two rows both relating to the same demographic_ID one with an address_1=Z and address_2=A and another one with address_1=A and address_2=Z you would get Z and Z returned (address_1 from the first and address_2 from the second). Which I don't think is what you are aiming for.

Firstly you could make things a little easier by changing the database slightly if you have that option. You could put in a most recentAddress id in your demographics which you could update and use to link directly to the address you want. Or you could add in a column into your address which stores the created time or edited time and use this to get the most recent time.

Alternatively if you do not want to change your database you could do as suggested above and use rownumber as the identifier for the most recent item in address. You would first need to return the demographic_id and maximum(rownumber) in a sub query and then return the remaining details of the address in a second sub query.

Graham
  • 41
  • 1
  • 9
0

One way to approach this in SQL Server is to use outer apply (in other databases, this is called a "lateral join"). Assuming you have some method for determining the most recent address, you can do:

select d.*, a.*
from [demographics] d outer apply
     (select a.*, s.state_title as state
      from address a inner join
           states s
           on a.[state_ID] = s.[state_ID]
      where a.demographic_ID = d.demographic_ID
      order by ?? desc
     ) a;

(I'm leaving out the details on the specific columns.)

The ?? represents the column that defines the "most recent" -- for instance, an identity primary key or a record creation date. In this case, outer apply is basically a correlated subquery that allows you to return more than one column.

Note: I strongly recommend using table aliases. They make the query easier to write and to read.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786