Note:
The observed behavior is correct, but may at first be surprising; it was to me, and I think it may be to others as well - though probably not to those intimately familiar with regex engines.
The repeatedly suggested duplicate, "Regex lookahead, lookbehind and atomic groups", contains general information about look-around assertions, but does not address the specific misconception at hand, as discussed in more detail in the comments below.
A greedy subexpression, which is by definition variable-width, can exhibit surprising behavior when used inside a positive look-behind assertion.
The examples use PowerShell for convenience, but the behavior applies to the .NET regex engine in general.
This command works as I intuitively expect:
# OK:
# The subexpression matches greedily from the start up to and
# including the last "_", and, by including the matched string ($&)
# in the replacement string, effectively inserts "|" there - and only there.
PS> 'a_b_c' -replace '^.+_', '$&|'
a_b_|c
The following command, which uses a positive look-behind assertion, (?<=...), is seemingly equivalent - but isn't:
# CORRECT, but SURPRISING:
# Use a positive lookbehind assertion to *seemingly* match
# only up to and including the last "_", and insert a "|" there.
PS> 'a_b_c' -replace '(?<=^.+_)', '|'
a_|b_|c # !! *multiple* insertions were performed
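To make the difference between the two commands visible, the match positions of both patterns can be listed with [regex]::Matches(), a method of .NET's System.Text.RegularExpressions.Regex class. This is only a diagnostic sketch: it shows where each pattern matches, not why. The variable names are mine, chosen for illustration.

```powershell
# Diagnostic sketch; $str, $plain and $lookBehind are illustrative names.
$str = 'a_b_c'

# Plain pattern: list the index and text of every match.
$plain = [regex]::Matches($str, '^.+_')
$plain | ForEach-Object { '{0}: "{1}"' -f $_.Index, $_.Value }

# Look-behind pattern: the matches are empty strings, so only
# the index at which the assertion holds is informative.
$lookBehind = [regex]::Matches($str, '(?<=^.+_)')
$lookBehind | ForEach-Object { '{0}: "{1}"' -f $_.Index, $_.Value }
```

The first pattern should report a single match, "a_b_", at index 0, whereas the second should report two empty matches, at indices 2 and 4, which are exactly the two positions where "|" was inserted.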
Why isn't it equivalent? Why were multiple insertions performed?