I am trying to write a regex-replace pattern to replace a number in a hash like this:

some_dict = {
  TEST: 123
}

such that 123 could be captured and replaced.

(?<= |\t*[a-zA-Z0-9_]+: |\t+)\d+(?=.*)

You'll see that this works perfectly fine in regexr.

When I run this gsub in irb, however, here is what happens:

irb(main):005:0> "  TEST: 123".gsub(/(?<= |\t*[a-zA-Z0-9_]+: |\t+)\d+(?=.*)/, "321")
SyntaxError: (irb):5: invalid pattern in look-behind: /(?<= |\t*[a-zA-Z0-9_]+: |\t+)\d+(?=.*)/

I looked at similar questions such as "Invalid pattern in look-behind", but I made sure to exclude capture groups from my look-behind, so I'm really not sure where the problem lies.

notacorn

1 Answer

The reason is that Ruby's Onigmo regex engine does not support infinite-width lookbehind patterns. Here, \t*, [a-zA-Z0-9_]+ and \t+ inside the lookbehind have no upper bound on their length.

In the general case, positive lookbehinds that contain quantifiers like *, + or {x,} can often be replaced with a consuming pattern followed by \K, which drops the text matched so far from the reported match:

/(?: |\t*[a-zA-Z0-9_]+: |\t+)\K\d+(?=.*)/
#^^^                         ^^  
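
As a quick check, the \K version can be run against the line from the question. This is only a minimal sketch: the variable names are made up for illustration, and the pattern is the one shown above, unchanged.

```ruby
# \K discards the text matched so far from the reported match,
# so gsub replaces only the digits that follow it.
pattern = /(?: |\t*[a-zA-Z0-9_]+: |\t+)\K\d+(?=.*)/

line = "  TEST: 123"          # sample line taken from the question
result = line.gsub(pattern, "321")

puts result
```

The key and the leading whitespace are still consumed by the match, but because they sit before \K they are left out of the part that gets replaced.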

However, you do not even need such a complicated pattern. (?=.*) is redundant: .* also matches an empty string, so the lookahead never rejects anything. Every alternative in the lookbehind ends with a space or a tab, and the first and last alternatives already accept any single space or tab, so the lookbehind succeeds whenever there is a space or tab immediately to the left of the current location. The regex is equivalent to

.gsub(/(?<=[ \t])\d+/, "321")

where the pattern matches

  • (?<=[ \t]) - a location immediately preceded by a space or a tab
  • \d+ - one or more digits.
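
To see how the simplified pattern behaves, here is a small sketch with three sample inputs. Only the first string comes from the question; the other two are assumed inputs added for illustration.

```ruby
simple = /(?<=[ \t])\d+/

after_space = "  TEST: 123".gsub(simple, "321")   # digits preceded by a space
after_tab   = "\tTEST:\t123".gsub(simple, "321")  # digits preceded by a tab
no_gap      = "TEST:123".gsub(simple, "321")      # digits preceded by ":" only

puts after_space
puts after_tab.inspect
puts no_gap
```

The last string is returned unchanged, because no digit in it has a space or tab directly to its left.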
Wiktor Stribiżew
  • If there is any chance of a number appearing after `123`, for example in a comment, do I need to add a lookahead as well? – notacorn May 22 '21 at 00:17
  • @notacorn Positive lookaheads are used to require a specific string context. `.*` matches an empty string, so your positive lookahead is redundant. – Wiktor Stribiżew May 22 '21 at 00:19
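
The point about the lookahead can be checked on a line with a trailing comment. This is a rough sketch: the sample line is hypothetical, and using sub instead of gsub is only one possible way to leave later numbers alone, assuming the first number on the line is the one to replace.

```ruby
line = "  TEST: 123 # was 456"   # hypothetical line with a number in a comment

with_lookahead    = line.gsub(/(?<=[ \t])\d+(?=.*)/, "321")
without_lookahead = line.gsub(/(?<=[ \t])\d+/, "321")

# Both calls replace the same numbers, so (?=.*) adds nothing.
puts with_lookahead == without_lookahead

# sub stops after the first match, so the number in the comment is kept.
first_only = line.sub(/(?<=[ \t])\d+/, "321")
puts first_only
```

With gsub, both 123 and 456 are replaced, since each is preceded by a space. Adding (?=.*) does not change that.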