0

From the following three tables, I need to:

  1. Return one row for each record in People, don't duplicate the person if they have more than one associated contact.
  2. Include associated ContactValue, where ContactType is HOMEPHONE, and the start of the date range is prior to now, and the end of the date range is later than now or null. If a person does not have a home phone, still show the person, but show NULL as the value for ContactValue.

People

PersonId (PK, int)
FullName (nvarchar)

PeopleContacts

PersonId (PK, int)
ContactId (PK, int)
StartValidDate (PK, datetime)
EndValidDate (datetime, null)

Contacts

ContactId (PK, int)
ContactType (nvarchar)
ContactValue (nvarchar)

The People table contains a unique list of people. PeopleContacts may contain several contact associations for each person, for different contact types, and a date range that the contact is good for. Contacts contains a list of contact values of certain types. For example, ContactType of "WORKPHONE" and ContactValue of "(555) 555 5555".

Requirements:

  1. Return one row for each record in People, don't duplicate the person if they have more than one associated contact.
  2. Include associated ContactValue, where ContactType is HOMEPHONE, and the start of the date range is prior to now, and the end of the date range is later than now or null. If a person does not have a home phone, still show the person, but show NULL as the value for ContactValue.

If People contained three rows, with:

1     Jones, Bob
2     Smith, Bob
3     Smith, Fred

And PeopleContacts contained four rows, with:

1     4    01/01/2012     NULL
2     1    01/01/2012     02/01/2012
2     2    02/02/2012     NULL
3     3    01/01/2012     NULL
3     4    01/01/2012     NULL

And Contacts contained four rows, with:

1     HOMEPHONE     (555) 555 5252
2     HOMEPHONE     (666) 666 6666
3     HOMEPHONE     (777) 777 7777
4     WORKPHONE     (555) 555 5555

Output from the correct query, if run after 02/02/2012, should look like:

FullName     ContactValue
--------     ------------
Jones, Bob   NULL
Smith, Bob   (666) 666 6666
Smith, Fred  (777) 777 7777

Jones, Bob has an association to a contact, but it's to a WORKPHONE, so ContactValue should be NULL. Smith, Bob has associations to two HOMEPHONE records, but only one, (666) 666 6666, still has a valid date range. Smith, Fred has an association to both a HOMEPHONE and a WORKPHONE.

Nick Silberstein
  • 833
  • 2
  • 8
  • 24
  • So the logic for multiple contacts is that only the first one is returned? – Oded Jul 06 '12 at 20:50
  • Correct. If a person had multiple associated HOMEPHONE contacts that were valid for a particular point in time, we'd only want to see one of them. In practice, the application should enforce that a person should only have one contact of a particular type for a given date range. – Nick Silberstein Jul 06 '12 at 20:54
  • Having both both `StartValidDate` and `EndValidDate` is convenient for querying but that way it is hard to enforce that a person have no more than one contact at any particular point of time. Consider getting rid of `EndValidDate`, that would let you create a unique constraint on `PersonId, StartValidDate` in `PeopleContacts`. – Andriy M Jul 08 '12 at 15:13

4 Answers4

1

I would solve this using a Common Table Expression - This should be more efficient than using APPLY which would be evaluated for each row. Using a CTE, you would just get a single result set and then do a left outer join on that.

I would use RANK to make sure you only get one contact back ... by ordering the CTE and joining the CTE by the first rank you will just get a single contact.

The SQL would look something like this:

WITH [CTE] AS (
SELECT
        pc.PersonId
    ,   pc.ContactValue
    ,   RANK() OVER (PARTITION BY PersonId ORDER BY ContactValue DESC) AS [Seed]
FROM
    PeopleContacts pc
    INNER JOIN  Contacts c ON (pc.ContactId = c.ContactId AND c.ContactType='HOME')
    WHERE pc.StartValidDate <= GETDATE()
    AND (pc.EndValidDate > GETDATE() OR pc.EndValidDate IS NULL)
)

SELECT      FullName
        ,   ISNULL(CTE.ContactValue, '') 
FROM People p
LEFT OUTER JOIN CTE ON (CTE.PersonId = p.PersonID AND CTE.Seed = 1)

(This code is untested, if you set up a sql fiddle then I'd be happy to get it working)

soupy1976
  • 2,787
  • 2
  • 26
  • 34
  • What do you think of tschmit007's solution? Both his and yours seem about equal in terms of complexity, though perhaps less readable than the OUTER APPLY method? Your untested code ran great, thanks! – Nick Silberstein Jul 06 '12 at 23:06
  • 2
    @NickSilberstein [see here](http://stackoverflow.com/questions/706972/difference-between-cte-and-subquery) for a discussion of subqueries vs a CTE used as a subquery. In this case (CTE is used just like a pre-defined subquery) it's probably just personal preference as to whether you prefer it defined up front, or within the main query. – soupy1976 Jul 07 '12 at 04:11
0

I believe that the following query does what I need, but I'd love feedback or a better approach to doing this:

SELECT FullName, c.ContactValue FROM People p
OUTER APPLY
    (
    SELECT TOP 1 ContactValue FROM PeopleContacts pc 
    LEFT OUTER JOIN Contacts c ON pc.ContactId = c.ContactId AND c.ContactType='HOME'
    WHERE pc.PersonId = p.PersonId
    AND pc.StartValidDate <= GETDATE()
    AND (pc.EndValidDate > GETDATE() OR pc.EndValidDate IS NULL)
    ORDER BY ContactValue DESC) c
Nick Silberstein
  • 833
  • 2
  • 8
  • 24
0

the following code seems correct:

declare @tPeoples table (
    id int,
    name nvarchar(50)
)

declare @tContacts table (
    id int,
    type char(10),
    value nvarchar(50)
)

declare @tPeopelContact table (
    peopleId int,
    contactId int,
    starting datetime2,
    ending datetime2
)

insert into @tPeoples (id, name) values
(1, 'Jones, Bob'),
(2, 'Smith, Bob'),
(3, 'Smith, Fred')

insert into @tContacts (id, type, value ) values
(1,     'HOMEPHONE',  '(555) 555 5252'),
(2,     'HOMEPHONE',  '(666) 666 6666'),
(3,     'HOMEPHONE',  '(777) 777 7777'),
(4,     'WORKPHONE',  '(555) 555 5555'),
(5,     'HOMEPHONE',  '(000) 123 5252')

insert into @tPeopelContact (peopleId, contactId, starting, ending) values
(1,     4,    '20120101',    NULL),
(2,     1,    '20120101',   '20120201'),
(2,     2,    '20120201',     NULL),
(3,     3,    '20120101',     NULL),
(3,     4,    '20120101',     NULL)


declare @currentdate datetime
declare @type char(10)

set @currentdate = '20120202'
set @type = 'HOMEPHONE'

select
    p.name, vc.value
from
    @tPeoples p
    left join 
        (select
            ROW_NUMBER () over (partition by peopleId order by starting) as rn, *
        from
            @tPeopelContact pc
            join @tContacts c on pc.contactId = c.id
        where 
            type = @type and
            starting < @currentdate and ( ending is null or ending >= @currentdate) 
        ) vc on rn = 1 and vc.peopleId = p.id
tschmit007
  • 7,559
  • 2
  • 35
  • 43
  • What is the advantage of this approach over the OUTER APPLY solution, or the CTE suggested by soupy1976? – Nick Silberstein Jul 06 '12 at 23:04
  • for the cte, no more thant yet said by @soupy1976. They are not my first thought because : no recursion in htis case and a too long time working on platforms that do not support it. For the outer apply, only an execution plan should say. My feeling is that outer apply make me think to an interative (line by line) process, and we are in a set (of lines by batch) world. – tschmit007 Jul 09 '12 at 08:06
0

The following query seems correct :

SELECT A.FullName, B.ContactValue
FROM People A
LEFT JOIN ( SELECT  X.PersonId,
                    Y.ContactValue,
                    ROW_NUMBER() OVER (PARTITION BY X.PersonId ORDER BY ContactId ) AS RowNumber
            FROM PeopleContacts X
            LEFT JOIN Contacts Y ON Y.ContactId = X.ContactId 
        ) B ON B.PersonId = A.PersonId AND B.RowNumber = 1
mehdi lotfi
  • 11,194
  • 18
  • 82
  • 128