0

I have a table (Table1) like the following:

Col1 Col2
First Code1,Code2,Code3
Second Code2

So Col2 can contain multiple values comma separated, I have another table (Table2) that contains this:

ColA ColB
Code1 Value1
Code2 Vaue2
Code3 Vaue3

I need to create a view that joins the two tables (Table1 and Table2) and returns something like this:

Col1 Col2
First Value1,Value2,Value3
Second Value2

Is that possible? (I'm on Oracle DB if that helps.)

navy1978
  • 1,411
  • 1
  • 15
  • 35
  • you really should read https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad – nbk Feb 06 '23 at 19:45
  • I forgot to mention table1 and table2 are not managed by me – navy1978 Feb 06 '23 at 19:46

2 Answers2

1

It's a violation of first normal form to have a list in a column value like that. It causes a lot of difficulties in a relational database, like the one you are encountering now.

However, you can get what you want by using the LIKE operator to find colA values that are substrings of the Col2 column. Add delimiters before and after to catch the first and last ones. Then aggregate back up to a single list using LISTAGG.

SELECT table1.col1,
       LISTAGG(table2.colB,',') WITHIN GROUP (ORDER BY table2.colB) value_list
  FROM table1,
       table2
 WHERE ','||table1.col2||',' LIKE '%,'||table2.colA||',%'
 GROUP BY table1.col1

This will not perform well on large volumes, because without an equijoin it's going to use nested loops, and you can't use an index on a LIKE predicate with % at the beginning. The combination of nested loops + FTS is not pleasant with large volumes of data. Therefore, if this is your situation, you will need to fix the 1NF problem by transforming table1 into normal relational format, and then join it to table2 with an equijoin, which will enable it to use a hash join instead. So:

SELECT table1.col1,
       LISTAGG(table2.colB,',') WITHIN GROUP (ORDER BY table2.colB) value_list
  FROM (SELECT t.col1,
               SUBSTR(t.col2,INSTR(t.col2,',',1,seq)+1,INSTR(t.col2,',',1,seq+1)-(INSTR(t.col2,',',1,seq)+1)) col2_piece
          FROM (SELECT col1,
                       ','||col2||',' col2
                  FROM table1) t,
               (SELECT ROWNUM seq FROM dual CONNECT BY LEVEL < 10) x) table1,
       table2
 WHERE table1.col2_piece IS NOT NULL
   AND table1.col2_piece = table2.colA
 GROUP BY table1.col1
Paul W
  • 5,507
  • 2
  • 2
  • 13
  • Yes I know table1 violates the 1 normal form but unfortunately is not managed by me. Thanks for your reply I’m going to try in a while! – navy1978 Feb 06 '23 at 19:48
  • @PaulW: Out of curiosity, why did not you use the standard JOIN syntax? – PM 77-1 Feb 06 '23 at 20:04
  • It works , I have accepted your Answer but it's very very very slow :( – navy1978 Feb 06 '23 at 20:05
  • Make sure you get hash joins + full table scans, and if you have PQ enabled use PQ. For example: SELECT /*+ USE_HASH(table1 table2) FULL(table1) FULL(table2) PARALLEL(4) */ table1.col1, . . . – Paul W Feb 06 '23 at 20:16
  • @PM because I've been writing Oracle code for nearly 3 decades and have no pressing reason to change to the ANSI syntax unless there's some specific functionality I need that it alone provides, which occasionally happens. – Paul W Feb 06 '23 at 20:17
  • @PaulW I've been writing Oracle code for over 3 decades and I switched to ANSI syntax 20 years ago ;) – William Robertson Feb 06 '23 at 22:55
  • Actually I think it's because native Oracle syntax is more concise. It takes a lot more typing to say LEFT OUTER JOIN ON .... than it does to type a single comma. I wrote so much SQL that that conciseness is worth being slightly archaic. Also there are times when the optimizer refuses to obey hints when using ANSI syntax, and only rewriting using Oracle native syntax will get it to heed the hint properly. So, just preference and experience, but everyone's got their own way. – Paul W Feb 07 '23 at 01:21
  • 1
    @navy1978, I edited my answer to deal with your reported slowness. I realized this morning that the hints I gave you wouldn't work without an equijoin, and that you must not have another column you are joining on. – Paul W Feb 07 '23 at 11:18
0

If you want the values in the same order in the list as the terms then you can use:

SELECT t1.col1,
       LISTAGG(t2.colb, ',') WITHIN GROUP (
         ORDER BY INSTR(','||t1.col2||',', ','||t2.colA||',')
       ) AS value2
FROM   table1 t1
       INNER JOIN table2 t2
       ON INSTR(','||t1.col2||',', ','||t2.colA||',') > 0
GROUP BY 
       t1.col1

Which, for the sample data:

CREATE TABLE Table1 (Col1, Col2) AS
SELECT 'First',  'Code1,Code2,Code3' FROM DUAL UNION ALL
SELECT 'Second', 'Code2' FROM DUAL;

CREATE TABLE Table2 (ColA, ColB) AS
SELECT 'Code1', 'XXXX' FROM DUAL UNION ALL
SELECT 'Code2', 'ZZZZ' FROM DUAL UNION ALL
SELECT 'Code3', 'YYYY' FROM DUAL;

Outputs:

COL1 VALUE2
First XXXX,ZZZZ,YYYY
Second ZZZZ

fiddle

MT0
  • 143,790
  • 11
  • 59
  • 117