1

I've reviewed this question and I'm wondering my output seems to be a little skewed.

From my understanding the REGEXP_REPLACE method, takes a string that you want to replace content with, followed by a pattern to match, then anything that does not match that pattern is replaced with the substitution param.

I've written the following function to extract distance from a text field, in which a spatial query will be performed on the result.

CREATE OR REPLACE FUNCTION extract_distance
(
    p_search_string   VARCHAR2
) 
    RETURN VARCHAR2
IS
    l_distance        VARCHAR2(25);
BEGIN

    SELECT REGEXP_REPLACE(UPPER(p_search_string), '(([0-9]{0,4}) ?MILES)', '') 
    INTO l_distance FROM SYS.DUAL;

    RETURN l_distance;
END extract_distance;

When I run this in a block to test:

DECLARE
    l_output VARCHAR2(25);
BEGIN
    l_output := extract_distance('Stores selling COD4 in 400 Miles');
    DBMS_OUTPUT.PUT_LINE(l_output);     
END;

I'd expect the output 400 miles but in-fact I get Stores selling COD4 in. Where have I gone wrong?

Community
  • 1
  • 1
Halfpint
  • 3,967
  • 9
  • 50
  • 92
  • I don't know Oracle, but it looks like your code is removing the Regex pattern (i.e. Replace the pattern with ''). I think you need to negate the Regex, i.e replace the part that is not the pattern with the empty string – Sparky Dec 06 '14 at 15:55
  • Thanks Sparky, those were my thoughts too, however in the question I referenced the opposite behaviour is demonstrated. Maybe I should write a subsitution expression to override the first part of the string. – Halfpint Dec 06 '14 at 15:57
  • @Alex: In that question the regular expression *is* negated. `[^0-9]` doesn't match a digit, it matches a character that is not a digit. The `^` at the beginning of the set makes it a negative set. – Guffa Dec 06 '14 at 16:03
  • @Guffa ah that is very interesting - Regex isnt my best area, so that would make sense – Halfpint Dec 06 '14 at 16:09
  • Performance wise, `SUBSTR` and `INSTR` approach would be efficient and less resource consuming than `REGEXP`. – Lalit Kumar B Dec 07 '14 at 03:05

2 Answers2

3

"REGEXP_REPLACE extends the functionality of the REPLACE function by letting you search a string for a regular expression pattern. By default, the function returns source_char with every occurrence of the regular expression pattern replaced with replace_string." from Oracle docu

You could use, e.g.,
SELECT REGEXP_REPLACE('Stores selling COD4 in 400 Miles', '^.*?(\d+ ?MILES).*$', '\1', 1, 0, 'i') FROM DUAL;

or alternatively
SELECT REGEXP_SUBSTR('Stores selling COD4 in 400 Miles', '(\d+ ?MILES)', 1, 1, 'i') FROM DUAL;

Abecee
  • 2,365
  • 2
  • 12
  • 20
0

You'll want to use, regexp_substr which returns a substring that matches the regular expression.

REGEX_SUBSTR

Nick
  • 2,524
  • 17
  • 25