3

For a database with multiple rows per e-mail address, I want to group by each email address, taking the "most recent" information for each e-mail address.

Email      Col1    Col2    Col3   Col4    CustomerID
=======    ====    ====    ====   ====    ==========
a@a.com    a       a       a      null    1
a@a.com    null    b       b      null    2
a@a.com    null    null    c      null    3

I want to take the non-null value with the highest CustomerID. For above, I'd expect:

Email      Col1    Col2    Col3   Col4
=======    ====    ====    ====   ====
a@a.com    a       b       c      null

I can do a GROUP BY, taking the MAX for each column, but it's simply the highest value alphabetically, and doesn't take CustomerID into consideration.

SQL Fiddle

SELECT EmailAddress, MAX(FirstName), MAX(LastName), MAX(Gender), MAX(Birthday), MAX(Country)
FROM CustomerInfo
GROUP BY EmailAddress

Additionally, this is being programmed in Exact Target, meaning some SQL keywords are unsupported, most notably variables, temp tables, and cursors are not supported.

Given these restrictions, is it possible to get the desired results?

Community
  • 1
  • 1
Will Eddins
  • 13,628
  • 5
  • 51
  • 85
  • I spent a longer amount of time trying to figure out a title than writing the question. Feel free to change the title if you have a better way to summarize the problem. – Will Eddins May 12 '14 at 18:37
  • How can you use `MAX` on a string type value? – Rahul May 12 '14 at 18:39
  • @Rahul It just gets the largest value, alphabetically. Regardless, it's not what I want, since I want it to just be the non-null value with the greatest CustomerID. – Will Eddins May 12 '14 at 18:40
  • Does it have to be null?How about empty? – Mihai May 12 '14 at 18:43
  • @Mihai If there's no value, it should be null, since an empty string could theoretically be a valid value. – Will Eddins May 12 '14 at 18:44
  • Are you trying to avoid grouping by a couple columns at a time (Email, Col1, MAX(CustomerID)) and then repeating that because it's unwieldy for more than a couple of columns? I mean, it would work, but you're doing a separate sub-select for every column. – Danny May 12 '14 at 18:49
  • @Danny - A seperate sub-select is fine, performance is not a large concern here because of all the constraints exact target has. I'm just not sure how I'd write that. – Will Eddins May 12 '14 at 18:53

1 Answers1

3

If I'm understanding your question correctly, I think you'll need to join the table to itself multiple times. Something like this should work using a common table expression to get the max customer id per each column where the column isn't null. Then it joins back to itself to get the value:

with cte as (
  select email, 
      max(case when col1 is not null then customerid end) maxcustomerid1,
      max(case when col2 is not null then customerid end) maxcustomerid2,
      max(case when col3 is not null then customerid end) maxcustomerid3,
      max(case when col4 is not null then customerid end) maxcustomerid4
    from yourtable
    group by email
)
select t.email,
  t1.col1,
  t2.col2, 
  t3.col3,
  t4.col4
from cte t
  left join yourtable t1 on t.email = t1.email and t.maxcustomerid1 = t1.customerid
  left join yourtable t2 on t.email = t2.email and t.maxcustomerid2 = t2.customerid
  left join yourtable t3 on t.email = t3.email and t.maxcustomerid3 = t3.customerid
  left join yourtable t4 on t.email = t4.email and t.maxcustomerid4 = t4.customerid
sgeddes
  • 62,311
  • 6
  • 61
  • 83
  • It seems like this would work if Exact Target supported Common Table Expressions. Unfortunately, it doesn't: http://help.exacttarget.com/en-US/documentation/exacttarget/interactions/activities/query_activity/#section_6 – Will Eddins May 12 '14 at 18:54
  • @WillEddins -- you don't need to use the CTE, just use it as a subquery instead. Updated Fiddle: http://sqlfiddle.com/#!3/35b27/8 – sgeddes May 12 '14 at 18:56