0

2 sample's of code:

      WITH A AS
  (SELECT 'this is a test:12/01/2015  01/05/2018'
    || chr(13)
    ||chr(10)
    || ' this is the 2nd line: 07/07/2017' Description
  FROM dual
  )
SELECT to_date(regexp_substr(A.Description , '\d{1,2}/\d{1,2}/\d{4}',1,level),'MM/DD/YYYY')
FROM A
  CONNECT BY level  <= regexp_count(a.description, '\d{1,2}/\d{1,2}/\d{4}');
/

and another:

SELECT DISTINCT REGEXP_SUBSTR ('23,34,45,56','[^,]+',1,LEVEL) as "token"
FROM   DUAL
CONNECT BY REGEXP_SUBSTR ('23,34,45,56','[^,]+',1,LEVEL) IS NOT NULL
order by 1;

Can someone explain me, why in 1st case we use '<=' and in 2nd we don't?

Michał M
  • 618
  • 5
  • 13
  • In the first example you can also use `IS NOT NULL` ==> just `CONNECT BY regexp_substr(a.description, '\d{1,2}/\d{1,2}/\d{4}', 1, level) is not null`. And in the second example yu can use `CONNECT BY level <= REGEXP_count ('23,34,45,56','[\,]+') + 1` instead `IS NOT NULL`. It hard to say why some has choosen `is not null` instead of `regexp_count`, it could be simply a personal preference, In my opinion. – krokodilko Jun 21 '16 at 11:13
  • Avoid using the regex in that second select for parsing delimited lists. It does not handle NULL list elements and will return incorrect values silently. See here for more information: http://stackoverflow.com/questions/31464275/split-comma-separated-values-to-columns-in-oracle/31464699#31464699 – Gary_W Jun 21 '16 at 15:49

2 Answers2

0
Introduced in Oracle 10g 

REGEXP_SUBSTR - Returns the string matching the regular expression.

Introduced in Oracle 11g 

REGEXP_COUNT - Returns the number of occurrences of the regular expression in the string.

The same you can achieve by without using relational operator '<=' . In your above code you are counting the occurrence .

WITH A AS
      (SELECT 'this is a test:12/01/2015  01/05/2018'
        || chr(13)
        ||chr(10)
        || ' this is the 2nd line: 07/07/2017' Description
      FROM dual
      )
    SELECT to_date(regexp_substr(A.Description , '\d{1,2}/\d{1,2}/\d{4}',1,level),'MM/DD/YYYY')
    FROM A
      CONNECT BY   
      regexp_substr(A.Description , '\d{1,2}/\d{1,2}/\d{4}',1,LEVEL) is not null;

In your second query you can make use of '<=' as per the need.

SELECT DISTINCT REGEXP_SUBSTR ('23,34,45,56','[^,]+',1,LEVEL) as "token",LEVEL
        FROM   DUAL
    CONNECT BY LEVEL  <= regexp_count(('23,34,45,56'),'[^,]+');
Nancy Guruswamy
  • 267
  • 1
  • 3
  • 14
0

Sometimes the number of rows found using IS NOT NULL on the output of REGEXP_SUBSTR is the same as the number of rows from REGEXP_COUNT and then they do the same thing and it is a matter of personal preference:

SELECT REGEXP_SUBSTR( '1,2,3', '\d+', 1, LEVEL ) AS value
FROM   DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( '1,2,3', '\d+' )

and

SELECT REGEXP_SUBSTR( '1,2,3', '\d+', 1, LEVEL ) AS value
FROM   DUAL
CONNECT BY REGEXP_SUBSTR( '1,2,3', '\d+', 1, LEVEL ) IS NOT NULL

both output:

VALUE
-----
1
2
3

However, sometimes the sub-string matched can be zero-width (which is equivalent to NULL) and then they do not:

SELECT REGEXP_SUBSTR( '1,2,,4', '(\d*)(,|$)', 1, LEVEL, 1 ) AS value
FROM   DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( '1,2,3', '(\d*)(,|$)' )

Outputs:

VALUE
------
1
2
(null)
4
(null)

and:

SELECT REGEXP_SUBSTR( '1,2,,4', '(\d*)(,|$)', 1, LEVEL, NULL, 1 ) AS value
FROM   DUAL
CONNECT BY REGEXP_SUBSTR( '1,2,,4', '(\d*)(,|$)', 1, LEVEL, NULL, 1 ) IS NOT NULL

Outputs:

VALUE
------
1
2

and stops at the first NULL value.

To get all the results you either need to use REGEXP_COUNT or not have the REGEXP_SUBSTR() function return NULL values in the CONNECT BY clause (so, in this example we could fix it by extracting the full match rather than just the capturing group; i.e. REGEXP_SUBSTR( '1,2,,4', '(\d*)(,|$)', 1, LEVEL ) IS NOT NULL).

MT0
  • 143,790
  • 11
  • 59
  • 117