0

I have a query which is taking hours to execute and sometimes it does not even execute. The query is as follows:

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0;

Result set of the query

SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null

is as follow:

ID       MULTILIST01 
295285  ,3434925,3434442,3436781,
212117  ,3434925,3434442,3436781,
212120  ,3434925,3434442,3436781,
6031650 ,3436781,
.
.
.

In the outer query I am trying to make every comma separated value as a unique value. When I execute the outer query, it takes hours to execute. I have tried optimizing it, but it was of no use.

Any idea how can I optimize it.

Oracle Version info

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0  Production
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

Explain table info

Plan hash value: 4097679000

------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |  1429 | 11432 |   840   (2)| 00:00:01 |
|*  1 |  FILTER                       |          |       |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|          |       |       |            |          |
|*  3 |    TABLE ACCESS FULL          | PAGE_TWO |  1429 | 11432 |   840   (2)| 00:00:01 |
------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$F5BB74E1
   3 - SEL$F5BB74E1 / PAGE_TWO@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TRIM( REGEXP_SUBSTR ("MULTILIST01",'[^,]+',1,LEVEL)) IS NOT NULL)
   2 - filter(INSTR("MULTILIST01",',',1,LEVEL-1)>0)
   3 - filter("MULTILIST01" IS NOT NULL)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
   2 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
   3 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020]

Table contains 225 columns where index is only on primary key columns (ID, CLASS).

This table is of Agile PLM.

Burhan Khalid Butt
  • 275
  • 1
  • 7
  • 20

1 Answers1

2

Your approach works for a table with one line only.

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 1 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;

        ID STR                     
---------- -------------------------
    295285 3434442                   
    295285 3434925                   
    295285 3436781  

Starting with two rows table you get (probably) a lot more results that expected:

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 2 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;   

        ID STR                     
---------- -------------------------
    212117 3434442                   
    212117 3434442                   
    212117 3434925                   
    212117 3436781                   
    212117 3436781                   
    212117 3436781                   
    212117 3436781                   
    295285 3434442                   
    295285 3434442                   
    295285 3434925                   
    295285 3436781                   
    295285 3436781                   
    295285 3436781                   
    295285 3436781        
;

This re-formulation of the query will get what you (probalbly) wants. Use a subquery which delivers the index of the substring (1 ..N). You must define the maximum number of the substring that will be splitted. Join this table with you table to effectively multiply the rows times N.

with substr_idx as (
select  rownum colnum from dual connect by level <= 3 /*  max  number of substrings */)   
SELECT id, trim(regexp_substr(str, '[^,]+', 1, colnum)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null), substr_idx
WHERE trim(regexp_substr(str, '[^,]+', 1, colnum)) is not null
order by 1,2;  

        ID STR                     
---------- -------------------------
    212117 3434442                   
    212117 3434925                   
    212117 3436781                   
    212120 3434442                   
    212120 3434925                   
    212120 3436781                   
    295285 3434442                   
    295285 3434925                   
    295285 3436781                   
   6031650 3436781 

Further (minor) performance improvement is expected if you replace the regexp with the substr / instr extraction. See e.g. here

One moral of this story, if you don't get results with big data try with small data (check the rownum <= 2 limitation above) and see if the results are as expected.

Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53