I'm trying to add an "s" to the end of a string. I know there are easier ways to append something to a string, but for my use case I have to work with a regex.

I currently have this:

    string str = "test";
    string res = Regex.Replace(str, "(.*)$", "$1s");

This gives me:

testss

Why is there an extra "s" at the end?

André Kool
Dany M
  • `.*` can match an empty string, that is why. There are two matches in the string with your pattern. – Wiktor Stribiżew Feb 19 '18 at 10:44
  • @WiktorStribiżew Isn’t the match greedy? – bfontaine Feb 19 '18 at 10:44
  • @bfontaine It does not matter, greedy or not. There are two matches, there are two replacements, it is not a bug. – Wiktor Stribiżew Feb 19 '18 at 10:45
  • @WiktorStribiżew Well it does, because the greediness determines the number of matches. If it weren’t greedy, you’d get a match on an empty string, then a match on `t`, then an empty string, then `e`, then an empty string, then `s`, etc. – bfontaine Feb 19 '18 at 10:47
  • @bfontaine *Greediness* has nothing to do with it **here**. The point is that `.*` matches any chars other than newline up to the end of string (or the last newline char) and then matches the empty string at the end of string. Using [`var rx = new Regex("(.*)$"); var res = rx.Replace("test", "$&s", 1);`](https://ideone.com/A7W0La) would fix it since only 1 replacement would occur. – Wiktor Stribiżew Feb 19 '18 at 10:50
  • @WiktorStribiżew wouldn't `(.+)$` be an easier fix? – Bill Tür stands with Ukraine Feb 19 '18 at 10:54
  • @ThomasSchremser No, the easiest fix is [`Regex.Replace(s, @"\z", "s")`](https://ideone.com/QSbcAR). There is no need to match anything before the end of string. And in case there is a trailing newline, `\z` is more appropriate than `$`. Anyway, OP is not asking for a workaround or fix, just the reason. And I explained the reason in my comments, and linked to a post that explains that in detail. This question is asked regularly, no need to repeat what has already been said. – Wiktor Stribiżew Feb 19 '18 at 11:01
  • I'd say that the simplest solution (matching the question) is to replace `$` with `s`. [Here at regexstorm](http://regexstorm.net/tester?p=%24&i=test&r=s) – SamWhan Feb 19 '18 at 11:08
  • @ClasG Simple does not mean correct. If the input is `"test\n"`, the output is `"tests\ns"`. See https://ideone.com/KQhurq, yes, still 2 replacements. Do not rely on online regex testers. I am really tired of repeating that. – Wiktor Stribiżew Feb 19 '18 at 11:11
  • @WiktorStribiżew Your reasoning is only true if the multi line option is chosen. The question says add "to the end of a string". W/o multi line, that's exactly what `$` matches. – SamWhan Feb 19 '18 at 11:35
  • @ClasG No, you are wrong. See again https://ideone.com/KQhurq - do you see `RegexOptions.Multiline` anywhere? Also, see [MSDN Regex docs](https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference#atomic_zerowidth_assertions): *`$` - The match must occur at the end of the string or before \n at the end of the line **or string**.* – Wiktor Stribiżew Feb 19 '18 at 11:37
  • @WiktorStribiżew Sorry to say, but you're right again ;). I do however think the docs are ambiguous, as [this](https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-options#Multiline) is what the MS docs say about multi-line: "*It changes the interpretation of the `^` and `$` language elements so that they match the beginning and end of a line, instead of the beginning and end of the input string.*" – SamWhan Feb 19 '18 at 11:47
  • @ClasG I agree that in that very place, the MSDN authors just left out that *or before `\n` at the end of the string* part. Maybe for brevity. I don't know. – Wiktor Stribiżew Feb 19 '18 at 11:50
  • @bfontaine I think you (and your comment upvoters) confused *greediness* with the `+` operator. `+` (one or more repetitions) and `*` (zero or more repetitions) are two different operators, and greediness really has nothing to do with this question. Here, both `.*?$` and `.*$` (that is, same quantifiers with different *greediness*) will produce the same result. – Wiktor Stribiżew Feb 19 '18 at 11:54
  • @WiktorStribiżew No confusion here; it depends on your regex engine implementation. The piece of code above *does* yield `"tests"` in some languages: https://gist.github.com/bfontaine/6d55ff9009bce1a92dd3d44e5c4cdfd8 I forgot about the other implementations :) – bfontaine Feb 19 '18 at 13:56
  • @bfontaine I know it works in some languages. It works in JS, because `$` in JS regex matches the very end of string, same as `\z` does in Perl, .NET, Java, Ruby regex and same as what `\Z` matches in Python `re`. The question is about .NET regex, and again greediness has nothing to do with it here. – Wiktor Stribiżew Feb 19 '18 at 13:59
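
To make the explanation in the comments concrete, here is a minimal sketch (not taken from the linked ideone snippets; the `Program` class and its layout are my own) that lists the matches `(.*)$` finds in `"test"` and then tries the fixes suggested above: limiting the replacement count to 1, using `(.+)$`, and using `\z`. The last line reproduces the `"test\n"` case mentioned in the comments.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Program
{
    public static void Main()
    {
        string str = "test";

        // The pattern matches twice: once on "test" (index 0, length 4)
        // and once on the empty string at the end (index 4, length 0).
        foreach (Match m in Regex.Matches(str, "(.*)$"))
        {
            Console.WriteLine($"index={m.Index}, length={m.Length}, value=\"{m.Value}\"");
        }

        // Two matches mean two replacements, hence the extra "s".
        Console.WriteLine(Regex.Replace(str, "(.*)$", "$1s"));        // testss

        // Fix 1: allow only one replacement (instance overload with a count).
        Console.WriteLine(new Regex("(.*)$").Replace(str, "$1s", 1)); // tests

        // Fix 2: require at least one character, so the empty match at the end is impossible.
        Console.WriteLine(Regex.Replace(str, "(.+)$", "$1s"));        // tests

        // Fix 3: match only the very end of the string and insert there.
        Console.WriteLine(Regex.Replace(str, @"\z", "s"));            // tests

        // Without RegexOptions.Multiline, "$" still matches before a final "\n"
        // as well as at the very end, so a bare "$" gives two replacements here.
        Console.WriteLine(Regex.Replace("test\n", "$", "s") == "tests\ns"); // True
    }
}
```

Of the three, `\z` is the one that does not depend on what precedes the end of the string, which is why it was recommended in the comments for input that may end in a newline.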