How to not replace a word within a pattern using regexp_replace?

Question

In the text below, the word number appears twice outside the link and once inside it. I want to replace the word everywhere except where it appears between <a href and a>. Is there a way to skip the occurrence inside this pattern using only regexp_replace?

The code below doesn't work as expected.

with t as (
select 'The Number can be a whole number. <a href https://number.com a>' as text from dual)
select regexp_replace(text,'^[<a href].*number.*[a>]','num') from t

The expected outcome is

The num can be a whole num. <a href https://number.com a>

Another option, apart from regex, is to split the string on < and > using SUBSTR and INSTR, and then use the REPLACE function on the part outside the tag. That is simpler, but it gets tough if there are multiple HTML tags. A regex such as `(?<!\/|\.)number` would do the job, but Oracle does not support lookbehind, so it works on Regex 101 but not in Oracle. Example: https://regex101.com/r/Bb54Pr/1/ — Deep, Oct 02 '19 at 18:14
You can't do it with a single call to `regexp_replace` because you want to replace the match with a new string and keep the non-match (match inside parentheses). It is only possible with a callback function passed as the replacement argument to regex replace functions, or with lookaheads. — Wiktor Stribiżew, Oct 02 '19 at 18:18
Here is an example of the approach I mentioned above. Again, it only helps if the HTML tag occurs once; otherwise it will fail miserably (it is not dynamic). `with t as ( select 'The number can be a whole number. ' as text from dual) select regexp_replace(SUBSTR(text,1,INSTR(text,'<',1)-1),'(number)','num') || SUBSTR(text, INSTR(text,'<',1)) from t` — Deep, Oct 02 '19 at 18:35
The html tag can be anywhere in the text. The code above works only if the HTML tag is after the text. Right? — ABY, Oct 02 '19 at 19:22
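
For comparison, the callback approach mentioned in the comments can be sketched in a regex engine that supports it. Oracle's regexp_replace has no callback argument, so this is shown in Python, purely as an illustration of the idea and not as an Oracle solution. The function name `shorten_outside_links` and the pattern are assumptions made for this sketch. The pattern matches either a whole `<a href ... a>` span or the bare word, and the callback returns the span unchanged when that is what matched.

```python
import re

# Match either a whole "<a href ... a>" span (captured in group 1)
# or the bare word "number" in any letter case.
PATTERN = re.compile(r"(<a href.*?a>)|number", re.IGNORECASE)


def shorten_outside_links(text: str) -> str:
    """Replace 'number' with 'num' except inside '<a href ... a>' spans.

    Hypothetical helper for illustration only.
    """
    def keep_or_replace(match: re.Match) -> str:
        # Group 1 is set only when the link span matched: keep it as-is.
        if match.group(1) is not None:
            return match.group(1)
        # Otherwise the bare word matched: replace it.
        return "num"

    return PATTERN.sub(keep_or_replace, text)


sample = "The Number can be a whole number. <a href https://number.com a>"
print(shorten_outside_links(sample))
# prints: The num can be a whole num. <a href https://number.com a>
```

Because the link span is tried first at each position and matched lazily up to the nearest `a>`, this also copes with several links anywhere in the text, as long as the spans are not nested.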

score 0 · Answer 1 · answered Oct 02 '19 at 18:52

I don't know of a way to do it in a single call, but you can do it with multiple calls.

First call: convert the "number" occurrences in the href to a different string
Second call: convert the remaining "number" occurrences
Third call: convert the "different string" occurrences back to "number".

E.g.,

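with t as (
select 'The Number can be a whole number. <a href https://number.com a>' as text from dual)
select regexp_replace(
          regexp_replace(
            -- 1) protect the "number" inside the link with a placeholder
            regexp_replace(text,'(<a href.*)(number)(.*a>)','\1$$$SAVE_NBR$$$\3'),
            -- 2) replace the remaining occurrences; 'i' also catches "Number"
            'number', 'num', 1, 0, 'i'),
          -- 3) restore the protected occurrence
          '\$\$\$SAVE_NBR\$\$\$','number')
from t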

I don't know why I used "$" in the "different string"... it just makes it harder. The point is to choose a string that can never occur naturally in your input.

score 0 · Answer 2 · answered Oct 02 '19 at 21:09

This works for the sample text, because it only replaces the word when it follows a space:

with t as (
select 'The Number can be a whole number. <a href https://number.com a>' as text from dual)
select regexp_replace(regexp_replace(text,' Number',' num'),' number',' num') from t

Nikhil S

What if there's another `" number"` with a leading space between `<a href` and `a>`, such as `......... ` ? – Barbaros Özhan Oct 02 '19 at 21:26
