
In the text below, the word number appears twice. I do not want to replace the occurrence that appears between the pattern <a href and a>. Is there a way to skip the word inside this pattern using only regexp_replace?

The code below doesn't work as expected.

with t as (
select 'The Number can be a whole number. <a href https://number.com a>' as text from dual)
select regexp_replace(text,'^[<a href].*number.*[a>]','num') from t

The expected outcome is

The num can be a whole num. <a href https://number.com a>
Wiktor Stribiżew
ABY
  • Apart from a regex, another solution is to split the string on < and > using SUBSTR and INSTR, and then use the REPLACE function instead of a regex. That is easier, but it gets tough if there are multiple HTML tags. We could write a regex for the same thing, `(?<!\/|\.)number`, but Oracle won't accept this kind of regex because it uses a lookbehind. It works on Regex 101 but not in Oracle. Example: https://regex101.com/r/Bb54Pr/1/ – Deep Oct 02 '19 at 18:14
  • You can't do it with a single call to `regexp_replace` because you want to replace the match with a new string and keep the non-match (match inside parentheses). It is only possible with a callback function passed as the replacement argument to regex replace functions, or with lookaheads. – Wiktor Stribiżew Oct 02 '19 at 18:18
  • Here is an example of the approach I mentioned above. Again, it helps only if the HTML tag occurs once; otherwise it will fail miserably (it is not dynamic). `with t as ( select 'The number can be a whole number. <a href https://number.com a>' as text from dual) select regexp_replace(SUBSTR(text,1,INSTR(text,'<',1)-1),'(number)','num') || SUBSTR(text, INSTR(text,'<',1)) from t` – Deep Oct 02 '19 at 18:35
  • The HTML tag can be anywhere in the text. The code above works only if the HTML tag comes after the text, right? – ABY Oct 02 '19 at 19:22
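
The SUBSTR/INSTR idea from the comments can be extended to a tag that sits in the middle of the text. The following is only a sketch under stated assumptions: the input contains exactly one tag, delimited by the first < and the first >, and the sample string is made up for illustration. It replaces "number" before and after the tag and copies the tag through unchanged.

```sql
with t as (
  -- hypothetical sample input with the tag in the middle of the text
  select 'A number <a href https://number.com a> and another number' as text
  from dual)
select
       -- part before the tag: replace every "number", ignoring case
       regexp_replace(substr(text, 1, instr(text, '<') - 1),
                      'number', 'num', 1, 0, 'i')
       -- the tag itself, from '<' up to and including '>', left untouched
    || substr(text, instr(text, '<'), instr(text, '>') - instr(text, '<') + 1)
       -- part after the tag: replace every "number", ignoring case
    || regexp_replace(substr(text, instr(text, '>') + 1),
                      'number', 'num', 1, 0, 'i') as result
from t
```

If the text has no tag at all, or more than one, this query gives wrong results, which is the limitation already pointed out in the comments.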

2 Answers


I don't know of a way to do it in a single call, but you can do it with multiple calls.

  • First call: convert the "number" occurrences in the href to a different string
  • Second call: convert the remaining "number" occurrences
  • Third call: convert the "different string" occurrences back to "number".

E.g.,

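with t as (
select 'The Number can be a whole number. <a href https://number.com a>' as text from dual)
select regexp_replace(
          regexp_replace(
            -- 1st call: protect the "number" inside the tag with a placeholder
            regexp_replace(text,'(<a href.*)(number)(.*a>)','\1$$$SAVE_NBR$$$\3'),
              -- 2nd call: replace the remaining occurrences; 'i' also matches "Number"
              'number', 'num', 1, 0, 'i'),
            -- 3rd call: restore the protected occurrence
            '\$\$\$SAVE_NBR\$\$\$','number')
from t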

I don't know why I used "$" in the "different string"... it just makes it harder. The point is to choose a string that can never occur naturally in your input.
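
As a minimal sketch of the same three-step idea, the placeholder can be a string with no regex metacharacters, so the last step needs no escaping and can use plain REPLACE. The placeholder #SAVE_NBR# is an arbitrary choice for illustration and is assumed never to occur in the input. As with the query above, this protects only one "number" inside a single tag.

```sql
with t as (
  select 'The Number can be a whole number. <a href https://number.com a>' as text
  from dual)
select replace(
         regexp_replace(
           -- step 1: hide the "number" inside the tag behind the placeholder
           regexp_replace(text, '(<a href.*)number(.*a>)', '\1#SAVE_NBR#\2'),
           -- step 2: replace every remaining "number", ignoring case
           'number', 'num', 1, 0, 'i'),
         -- step 3: put the protected "number" back (plain string replace)
         '#SAVE_NBR#', 'number') as result
from t
```

For the sample row this should return the expected outcome from the question, with the URL left intact.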

Matthew McPeak

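This will work for the sample input, because each occurrence to replace is preceded by a space while the one in the URL is not: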

with t as (
select 'The Number can be a whole number. <a href https://number.com a>' as text from dual)
select regexp_replace(regexp_replace(text,' Number',' num'),' number',' num') from t
Nikhil S