78

I got this query and want to extract the value between the brackets.

select de_desc, regexp_substr(de_desc, '\[(.+)\]', 1)
from DATABASE
where col_name like '[%]';

It however gives me the value with the brackets such as "[TEST]". I just want "TEST". How do I modify the query to get it?

Henley
  • 21,258
  • 32
  • 119
  • 207

2 Answers2

127

The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.

In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:

SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;

Where the last parameter 1 indicate the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.

10g does not appear to have this option, but in your case you can achieve the same result with:

select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);

since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Dave Costa
  • 47,262
  • 8
  • 56
  • 72
  • 9
    What's amazing is that 6th parameter is not mentioned in Oracle's official REGEXP_SUBSTR documentation. Thanks for pointing out it exists. – Major Major Mar 08 '18 at 17:20
  • 4
    It looks like google will take you to old documentation for REGEXP_SUBSTR, but if you are able to find 11g documentation you can see the 6th parameter: https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions138.htm#SQLRF06303 – BurnsBA Apr 25 '19 at 14:51
27

You need to do a replace and use a regex pattern that matches the whole string.

select regexp_replace(de_desc, '.*\[(.+)\].*', '\1') from DATABASE;
ivanatpr
  • 1,862
  • 14
  • 18
  • 2
    IMHO this is the simplest, easy to remember, more flexible and therefore the best way to do it. – Robert Lujo Jan 28 '15 at 11:40
  • 16
    I would caution anyone using `REGEXP_REPLACE` to get the capture group that if the pattern is not matched, Oracle will return the entire value, whereas the behavior you probably want is for it to return `null`. For example, `REGEXP_REPLACE ('abcdefghi', '.*\[(.+)\].*', '\1')` (pattern is not matched) returns `abcdefghi`. This tripped me up once. – Jared Feb 03 '15 at 16:10
  • The problem here is that Oracle regexp does not provide a function to return the part of the string matched by a capturing group. You can try using a recursive query (select ... from tab connect by ...) but be awre of the potential performance issue. – Cyryl1972 Mar 04 '16 at 07:52
  • To get around the problem @Jared mentions, you can do an "or everything" `|^.*$` capture. `regexp_replace(de_desc, '.*\[(.+)\].*|^.*$', '\1')`. If nothing is captured in capture group 1, then the entire string will be replaced with null. – Therkel Apr 08 '22 at 09:28