  • i need to replace all occurrences of a string within another string, if the original string matches some filter
  • i can only use a single regex using an s command, because i need to send the assembled command to a 3rd party API

i have tried to use a positive lookahead so as not to consume the string in which i want to replace characters, but somehow i cannot get the replacing to work as expected.

here is what i have tried so far and what the outcome was. (note that the filter, here [0-9]+, is just an example; it will be passed in from the call site and i cannot directly influence it.)

expected result: 9999997890

perl -e '$x = "4564567890"; $x =~ s/(?=^[0-9]+$)456/999/g; print $x'

actual result: 9994567890

  1. this replaces only the first occurrence of 456. why is this happening?
  2. even less understandable for me is that if i change the filter lookahead to (?=.*), both occurrences of 456 are being replaced. why does changing the filter have any effect on the replacing portion of the regex?

i seem to be missing some very basic point about how mixing filtering and replacing stuff in one s command works.

  • I think you need `s/(?:\G(?!^)|^(?=\d+$))\d*?\K456/999/g` – Wiktor Stribiżew Dec 11 '19 at 10:17
  • @WiktorStribiżew this seems to work...can you write it as an answer so i can accept it? also, would you care to explain why your regex works and mine does not? :D – Armin Walland Dec 11 '19 at 10:27
  • Probably simpler to understand if you could use eg `^.*?\D.*(*SKIP)(*F)|456` to [skip](https://stackoverflow.com/questions/24534782/how-do-skip-or-f-work-on-regex) strings that don't contain only digits. – bobble bubble Dec 11 '19 at 10:38

2 Answers


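Your regex only replaces the 456 that is at the start of a string that consists only of digits. The lookahead (?=^[0-9]+$) contains ^, so it can only succeed at offset 0, and every match has to begin there; the second 456 starts at offset 3 and is never considered. With (?=.*) the lookahead succeeds at every position, which is why both occurrences get replaced.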

You may use

s/(?:\G(?!^)|^(?=\d+$))\d*?\K456/999/g

See the regex demo

Pattern details

  • (?:\G(?!^)|^(?=\d+$)) - a custom boundary that matches either the end of the previous successful match (\G(?!^)) or (|) the start of a string (^) that consists only of digits ((?=\d+$))
  • \d*? - zero or more digits, as few as possible
  • \K - discards the text matched so far from the match, so only the 456 that follows gets replaced
  • 456 - a 456 substring.

The idea is:

  • Use the \G based pattern to pre-validate the string: (?:\G(?!^)|^(?=<YOUR_VALID_LINE_FORMAT>$))
  • Then adapt the consuming pattern that follows it (\d*? in the example above) so that it can skip whatever the valid format allows between two matches.
Wiktor Stribiżew

Alternatively, you can probably use (*SKIP)(*F) to skip strings that are not composed only of digits.

s/^\d*\D.*(*SKIP)(*F)|456/999/g

See this demo at regex101 or your demo at tio.run

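The left part, ^\d*\D.*, tries to find a non-digit (\D) starting from the beginning of the string. If it finds one, .* consumes the rest of the string and (*SKIP)(*F) forces a failure while preventing any retry inside the text consumed so far, so nothing in that string is replaced. If there is no non-digit, the left part cannot match and the right side of the alternation (|) matches the substring 456.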

bobble bubble