3

I am trying to capture the table names starting with some patterns and find the total counts for each group but i would specifically like to capture ,below is an example

I think the best possible way to capture the above is using REGEXP ?

Query 1 - Initial

SELECT
   owner AS schema_name, 
   object_name, 
   object_type,
   REGEXP_REPLACE(OBJECT_NAME, '^([A-Z0-9$]{1,})_.*', '\1') as BEGINNING,
   count(*),
   round(100*ratio_to_report(count(*)) over (), 4) percentage 
FROM
   dba_objects 
GROUP BY
   owner,
   object_name,
   object_type,
   REGEXP_REPLACE(OBJECT_NAME, '^([A-Z0-9$]{1,})_.*', '\1') 
ORDER BY
   percentage desc;

Query 1 - existing result

SCHEMA OBJECT_NAME       OBJECT_TYPE  BEGINNING COUNT(*) PERCENT
STG    AB_01_CUST_ENRLMT TABLE          AB          1    .00001
STG    DKS_SD_PRDCT_DHSS TABLE          DKS         1    .00001
STG    ABC10_CUST_ENRLMT  TABLE         ABC10       1    .00001

Query 1 - expected results

 SNO SCHEMA OBJECT_NAME       OBJECT_TYPE  BEGINNING             COUNT(*) PERCENT
  1  STG    AB_01_CUST_ENRLMT TABLE        AB_01                 1       .00001
  2  STG    DKS_SD_PRDCT_DHSS TABLE        DKS_SD                1       .00001
  3  STG    ABC10_CUST_ENRLMT TABLE        ABC10_CUST            1       .00001
  4  STG    #Tableau_6_sid:15 TABLE        #Tableau_6            4000    1.5
  5  STG    /157d_PoI_12 TABLE             /157d_PoI             5000    1.6
  6  STG    JAVA/157d_Ph TABLE             JAVA/157_Ph           5000    1.6
  7  STG    STU$BA_COENT_123 TABLE        STU$BA_COENT           5000    1.5

how to achieve as per expected results

After suggestions i have incorporated the below logic and re ran , unfortunately i am not getting the desired result

Query 2 - Modified

SELECT
   owner AS schema_name,
   object_name,
   object_type,
   REGEXP_REPLACE(OBJECT_NAME, '^([A-Z0-9$]{1,})_([A-Z0-9$]{1,})_.*', '\1_\2')as BEGINNING,
   count(*),
   round(100*ratio_to_report(count(*)) over (), 4) percentage 
FROM
   dba_objects 
GROUP BY
   owner,
   object_name,
   object_type,
   REGEXP_REPLACE(OBJECT_NAME, '^([A-Z0-9$]{1,})_.([A-Z0-9$]{1,})_.*', '\1_\2')
ORDER BY
   percentage desc;

Query 2 - Results - Not as expected

 SNO SCHEMA OBJECT_NAME       OBJECT_TYPE  BEGINNING           COUNT(*) PERCENT
 1   STG    AB_01_CUST_ENRLMT TABLE        AB_1                  1      .00001
 2   STG    DKS_SD_PRDCT_DHSS TABLE        DKS_D                 1      .00001
 3   STG    ABC_25_ENRLMT     TABLE        ABC_5                 1      .00001
  4  STG    #Tableau_6_sid:15 TABLE        #Tableau_6            4000    1.5
  5  STG    /157d_PoI_12 TABLE             /157d_PoI             5000    1.6
  6  STG    JAVA/157d_Ph TABLE             JAVA/157_Ph           5000    1.6
  7  STG    STU$BA_COENT_123 TABLE        STU$BA_COENT           5000    1.5
 

what is that i need to correct in the REGEX logic ? , i have come across some special cases

Query 2 - Results - Not as expected

Data2explore
  • 452
  • 6
  • 16

2 Answers2

0

Your regex is fetching the first occurrence of (char digit) pattern. That is; ([A-Z0-9$]{1,})

The (...) bracets are used for grouping and in the third parameter of the REGEXP_REPLACE you are using this one group as replacement of the entire original string. There are other ways to solve this, but if you wish to continue using the same approach, you can extend your regex to the following;

REGEXP_REPLACE(OBJECT_NAME, '^([A-Z0-9$]{1,})_.([A-Z0-9$]{1,})_.*', '\1_\2')

Edited regex;

REGEXP_REPLACE(OBJECT_NAME, '^([A-Z0-9$]{1,})_([A-Z0-9$]{1,})_.*', '\1_\2')

This should replace the string with the (first_group)_(second_group) which is mentioned as \1_\2 in the REGEXP_REPLACE parameter.

Shivam Puri
  • 1,578
  • 12
  • 25
  • Man thanks for the explanation , unfortunately i have used the above logix but not getting the desired result , i am getting 1) DKS_D – Data2explore Nov 20 '20 at 22:23
  • Can you please check my query which i have tested , is something i am missing in the regex ? – Data2explore Nov 21 '20 at 17:29
  • I edited my query, there should not be a dot between the 2 bracket groups...```(...)_.(...)``` should be ```(...)_(...)``` – Shivam Puri Nov 21 '20 at 18:48
  • Many thanks for the logic , while i am testing your logic , i have come across few cases , unfortunately #Tableau_6_sid:15251a_4i_2a, /15722c29d_PhotoImageCameraPro and JAVA/15722c29d_PhotoImageqeeee ,how to handle theses cases ? – Data2explore Nov 21 '20 at 21:38
  • You did not change query 2 yet. Please try again with my edited regex. Remove the dot. After that let me know which case works and which case doesn't work – Shivam Puri Nov 22 '20 at 17:13
0

REGEXP_REPLACE() might be applied with two kind of patterns depending on the number of underscores whether less than or equal to two or more :

SELECT object_name,
       CASE 
       WHEN REGEXP_COUNT(object_name,'_') <= 2 THEN
            REGEXP_REPLACE(object_name,'(*.)(_.*){1,}(_.*)$', '\1\2')
       ELSE REGEXP_REPLACE(object_name,'^([[:alnum:]]{1,})_([[:alnum:]]{1,}).*', '\1_\2')     
        END AS new_object_name
  FROM user_objects

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
  • Many thanks for the logic , while i am testing your logic , i have come across few cases , unfortunately #Tableau_6_sid:15251a_4i_2a, /15722c29d_PhotoImageCameraPro and JAVA/15722c29d_PhotoImageqeeee ,how to handle theses cases ? – Data2explore Nov 21 '20 at 21:37
  • you're welcome @rakesh . What do you expect to return from newly inserted records `#Tableau_6_sid:15251a_4i_2a`, `/15722c29d_PhotoImageCameraPro`, `JAVA/15722c29d_PhotoImageqeeee` ..? You can edit the `expected results result` part with this info. – Barbaros Özhan Nov 22 '20 at 12:26
  • ,please check now i have updated the expected results section – Data2explore Nov 22 '20 at 13:31
  • i have tested still i am encountering new cases , please check the cases above – Data2explore Nov 22 '20 at 16:15
  • What you newly added(`STUD$BANK_CONTENT` or `STU$BA_COENT_123`) have no problem due to your desired style. e.g. already conforms to the pattern provided @rakesh – Barbaros Özhan Nov 22 '20 at 16:49
  • Many thanks i have a short question is it possible to include the case statement regex in group by , i have tried but i was not able to execute the query – Data2explore Nov 22 '20 at 21:04
  • Hi @rakesh . Check [this](https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=5d082d8dbe34d315cc39a5bfc99ba518) out. – Barbaros Özhan Nov 22 '20 at 21:20