0

Table data set:


SSN        ID  F_NAME  L_NAME
000000000  1   JANE    DOE
123456789  2   PETE    PETER
123456789  3   JOHN    SHAW
234567890  4   ALAN    DAVID
234567890  5   MAY     APRI
345677891  6   JUNE    ALYSSA
456789123  7   LISA    LINH
567891234  8   HAL     LEY
567891234  9   DEREK   LI
567891234  10  JIM     JAMES

From the data set above, my intention is to identify IDs/demographics with same SSNs and list horizontally all the names and the results as below. We are also interested in the count of unique IDs associated with each SSN.


SSN       COUNT ID  F_NAME  L_NAME ID_2 F_NAME_2 L_NAME_2 ID_3 F_NAME_3 L_NAME_3
123456789 2     2   PETE    PETER  3    JOHN     SHAW
234567889 2     4   ALAN    DAVID  5    MAY      APRI
567891234 3     8   HAL     LEY    9    DEREK    LI       10   JIM      JAMES    

Help?

Dave Jabbz
  • 109
  • 7

2 Answers2

1

I'd use listagg in the following fashion, and you can add the unique ID as well:

create table demographics(
ssn number(9),
id number(12),
f_name varchar2(50),
l_name varchar2(50));


insert into demographics values(000000000,  1,   'JANE',    'DOE');
insert into demographics values(123456789,  2,   'PETE',    'PETER');
insert into demographics values(123456789,  3,   'JOHN',    'SHAW');
insert into demographics values(234567890,  4,   'ALAN',    'DAVID');
insert into demographics values(234567890,  5,   'MAY',     'APRI');
insert into demographics values(345677891,  6,   'JUNE',    'ALYSSA');
insert into demographics values(456789123,  7,   'LISA',    'LINH');
insert into demographics values(567891234,  8,   'HAL',     'LEY');
insert into demographics values(567891234,  9,   'DEREK',   'LI');
insert into demographics values(567891234,  10,  'JIM',     'JAMES');

SELECT ssn, count(ssn) as count, LISTAGG(f_name || ' ' || l_name, ', ') WITHIN GROUP (ORDER BY f_name || ' ' || l_name) "Names"
FROM demographics
group by ssn;



SSN COUNT   Names
0   1   JANE DOE
123456789   2   JOHN SHAW, PETE PETER
234567890   2   ALAN DAVID, MAY APRI
345677891   1   JUNE ALYSSA
456789123   1   LISA LINH
567891234   3   DEREK LI, HAL LEY, JIM JAMES
Zynon Putney II
  • 675
  • 4
  • 12
  • thanks! as i have a large data set, this does not seem to work also due to the 'ORA-01489: result of string concatenation is too long'. Kindly have a similar implementation with the PIVOT? – Dave Jabbz May 17 '18 at 12:56
1

See more information on pivot here as per @lad2025's advice

Depending on what format is allowable to you, a quick solution I would use would be the listagg function in SQL (Oracle) which is similar to MySQL's group_concat such as

SELECT SSN,
       count(*),
       LISTAGG(ID, ', ') WITHIN GROUP (ORDER BY ID),
       LISTAGG(F_NAME, ', ') WITHIN GROUP (ORDER BY F_NAME)
FROM table GROUP BY SSN HAVING count(*) > 1;

would result in

123456789 2     2,3      PETE, JOHN    
234567889 2     4,5      ALAN, MAY   
567891234 3     8,9,10   HAL, DEREK, JIM

Alternatively, you could identify all rows with duplicate SSNs - this would avoid using pivot and keep your result in the same relational format

SELECT *
FROM table
WHERE SSN in (SELECT SSN from table group by SSN having count(*) > 1);

would return all rows with duplicate SSNs:

SSN        ID  F_NAME  L_NAME
123456789  2   PETE    PETER
123456789  3   JOHN    SHAW
234567890  4   ALAN    DAVID
234567890  5   MAY     APRI
567891234  8   HAL     LEY
567891234  9   DEREK   LI
567891234  10  JIM     JAMES
cacti5
  • 2,006
  • 2
  • 25
  • 33
  • thanks @Anna. I get a 'ORA-01489: result of string concatenation is too long' with this implementation. Seems while it may work under certain conditions, there's the limit – Dave Jabbz May 17 '18 at 12:48
  • @DaveJabbz That is because you probably have duplicate SSNs with many people associated with them. I added an alternative to my answer. – cacti5 May 17 '18 at 16:04