I am running into performance issue when I use ORACLE CONNECT BY LEVEL due to duplicate records. My source table has around 50k records, in which one column (code in the example) can have multiple values separated by semicolon. Finally I should get an out with create one record per each unique value (code).
Source data:
CREATE TABLE TEST
(name varchar2(255),
code varchar2(255)
);
insert into test values ('ABC-Amherst','1179;348;403;407;408;412;413;426;428;430');
insert into test values ('ABC-Amherst','1179;352;401;403;412;413;416;420;433;A40');
insert into test values ('ABC-Ville','1179;352;401;403;412;413;416;420;433;A40');
commit;
Expected Output:
Name Code
ABC-Amherst 348
ABC-Amherst 408
ABC-Amherst 430
ABC-Amherst 428
ABC-Amherst 426
ABC-Amherst 433
ABC-Amherst A40
ABC-Amherst 407
ABC-Amherst 352
ABC-Amherst 401
ABC-Amherst 413
ABC-Amherst 403
ABC-Amherst 420
ABC-Amherst 416
ABC-Amherst 1179
ABC-Amherst 412
ABC-Ville 412
ABC-Ville 352
ABC-Ville A40
ABC-Ville 420
ABC-Ville 416
ABC-Ville 401
ABC-Ville 1179
ABC-Ville 433
ABC-Ville 403
ABC-Ville 413
Query used - Below query is returning duplicates. When I run it on my full dataset it runs forever and finally freezes. I tried adding distinct/row number but they are also running forever. Is there a way to prevent the duplicates or any other efficient ways to get the expected output?
SELECT name,
trim(regexp_substr(code,'(;|^)([^;]+)',1,level,NULL,2) ) AS code
FROM test
CONNECT BY prior name=name
AND level <= 1 + regexp_count(code,';')
AND prior sys_guid() IS NOT NULL;