I have a table that contains customers information. Each customer is assigned a Customer ID (their SSN) that they retain as they open more accounts. Two customers may be on the same account, each with their own ID. The account numbers are not ordered by date.
I would like to find the most recent account of each customer or group of customers. If two customers have ever been on an account together, I want to return the most recent account either customer has been on.
Here is a sample table with some of the possible cases.
Example table ACCT:
acctnumber date Cust1ID Cust2ID
10000 '2016-02-01' 1110 NULL --Case0-customer has only ever had
--one account
10001 '2016-02-01' 1111 NULL --Case1-one customer has multiple
10050 '2017-02-01' 1111 NULL --accounts
400050 '2017-06-01' 1111 NULL
10089 '2017-12-08' 1111 NULL
10008 '2016-02-01' 1120 NULL --Case2-customer has account(s) and later
10038 '2016-04-01' 1120 NULL
10058 '2017-02-03' 1120 1121 --gets account(s) with another customer
10002 '2016-02-01' 1112 NULL --Case3-customer has account(s) and later
10052 '2017-02-02' 1113 1112 --becomes the second customer on another
10152 '2017-05-02' 1113 1112 --account(s)
10003 '2016-02-02' 1114 1115 --Case4-customer and second customer
7060 '2017-02-04' 1115 1114 --switch which is first and second
10004 '2016-02-02' 1116 1117 --Case5-second customer later gets
10067 '2017-02-05' 1117 NULL --separate account(s)
10167 '2018-02-05' 1117 NULL
50013 '2016-01-01' 2008 NULL --Case5b -customer has account(s) & later
50014 '2017-02-02' 2008 2009 --gets account(s) with second customer &
50015 '2017-04-04' 2008 NULL --later still first customer gets
100015 '2018-05-05' 2008 NULL --separate account(s)
30005 '2015-02-01' 1118 NULL --Case6-customer has account(s)
10005 '2016-02-01' 1118 NULL
10054 '2017-02-02' 1118 1119 --gets account(s) with another
40055 '2017-03-03' 1118 1119
10101 '2017-04-04' 1119 NULL --who later gets separate account(s)
10201 '2017-05-05' 1119 NULL
30301 '2017-06-06' 1119 NULL
10322 '2018-01-01' 1119 NULL
10007 '2016-02-01' 1122 1123 --Case7-customers play musical chairs
10057 '2017-02-03' 1123 1124
10107 '2017-06-02' 1124 1125
50001 '2016-01-01' 2001 NULL --Case8a-customers with account(s)
50002 '2017-02-02' 2001 2002 --together each later get separate
50003 '2017-03-03' 2001 NULL --account(s)
50004 '2017-04-04' 2002 NULL
50005 '2016-01-01' 2003 NULL --Case8b-customers with account(s)
50006 '2017-02-02' 2003 2004 --together each later get separate
50007 '2017-03-03' 2004 NULL --account(s)
50008 '2017-04-04' 2003 NULL
50017 '2018-03-03' 2004 NULL
50018 '2018-04-04' 2003 NULL
50009 '2016-01-01' 2005 NULL --Case9a-customer has account(s) & later
50010 '2017-02-02' 2005 2006 --gets account(s) with a second customer
50011 '2017-03-03' 2005 2007 --& later still gets account(s) with a
--third customer
50109 '2016-01-01' 2015 NULL --Case9b starts the same as Case9a, but
50110 '2017-02-02' 2015 2016
50111 '2017-03-03' 2015 2017
50112 '2017-04-04' 2015 NULL --after all accounts with other customers
50122 '2017-05-05' 2015 NULL --are complete, the original primary
--customer begins opening individual
--accounts again
Desired Results:
acctnumber date Cust1ID Cust2ID
10000 '2016-02-01' 1110 NULL --Case0
10089 '2017-12-08' 1111 NULL --Case1
10058 '2017-02-03' 1120 1121 --Case2
10152 '2017-05-02' 1113 1112 --Case3
7060 '2017-02-04' 1115 1114 --Case4
10167 '2018-02-05' 1117 NULL --Case5
100015 '2018-05-05' 2008 NULL --Case5b
10322 '2018-01-01' 1119 NULL --Case6
10107 '2017-06-02' 1124 1125 --Case7
50003 '2017-03-03' 2001 NULL --Case8a result 1
50004 '2017-04-04' 2002 NULL --Case8a result 2
50017 '2018-03-03' 2004 NULL --Case8b result 1
50018 '2018-04-04' 2003 NULL --Case8b result 2
50011 '2017-03-03' 2005 2007 --Case9a
50122 '2017-05-05' 2015 NULL --Case9b
Alternatively, I would accept Case 7 outputting the two separate customer groups:
10007 '2016-02-01' 1122 1123 --Case7 result 1
10107 '2017-06-02' 1124 1125 --Case7 result 2
Because Cases 8a & 8b would represent the company acknowledging the customers are worthy of holding separate accounts, we would want to then consider their group as splitting, so it has separate sets of results.
Also, in most scenarios the customers have many accounts, and mix and matching the above cases overtime is common. For example, a single customer can have five accounts (Case 1), then later opens one or more accounts with another customer (Case 3) sometimes switching the primary account holder (Case 4) then afterwards the first customer begins opening individual accounts again (Case 5b).
I have attempted joining the table to a copy of itself whenever acctnumbers are unique and any of the Cust IDs match. However, this removes customers who have only had one account so I added a union of cust that have no matches on the custid or account number and groups by custid.
Unfortunately, the second piece does not only include custids from case 0 and there are some custids which are excluded all together that shouldn't be.
select
max(date1) as date,
cust1id1 as cust1id
from
(
select
acctnumber as [acctnumber1],
date as [date1],
cust1id as [cust1id1],
cust2id as [cust2id1]
from
acct
) t1
join
(
select
acctnumber as [acctnumber2],
date as [date2],
cust1id as [cust1id2],
cust2id as [cust2id2]
from
acct
) t2
on t1.date1 > t2.date2 and
(t1.cust1id1 = t2.cust1id2 or
t1.cust1id1 = t2.cust2id2 or
t1.cust2id1 = t2.cust2id2)
Group by
cust1id1
union
select
max(date1) as date,
cust1id1 as cust1id
from
(
select
acctnumber as [acctnumber1],
date as [date1],
cust1id as [cust1id1],
cust2id as [cust2id1]
from
acct
) t1
join
(
select
acctnumber as [acctnumber2],
date as [date2],
cust1id as [cust1id2],
cust2id as [cust2id2]
from
acct
) t2
on (t1.acctnumber1 != t2.acctnumber2 and
t1.cust1id1 != t2.cust1id2 and
t1.cust1id1 != t2.cust2id2 and
t1.cust2id1 != t2.cust2id2)
group by
cust1id1
Update
Thank you for all the great answers and comments so far. I have been trying out the queries and comparing results.
@VladimirBaranov has brought up a rare case that I had not previously considered in comments to other answers.
Similarly to case 7, it will be a bonus if Case8 is handled, but not expected.
Case 9 is important and the result for 9a and 9b should be handled.
Update 2
I noticed issues with my original set of 7 cases.
In more recent accounts, when a customer is no longer on the account, it was always the second borrower that remained. This was entirely unintentional, you can look at any of those examples and either customer can potentially be the remaining customer on the most recent account.
Also, each case had the minimum number of accounts to display exactly what the case was testing, but this is not common. Usually in each step of each case there can be 5, 10, 15 or more accounts before a customer switches to adding on a second customer, and those two can then have many accounts together.
Reviewing the answers I see many have index, create, update and other clauses specific to being able to edit the database. Unfortunately, I am on the consumer side of this database so I have read only access, and the program I can use to interact with the database automatically rejects them.