How do I get Regex.Replace to replace using a wildcard, but with a maximum length on the wildcard?

Such as using the word 'Murdered':

It will replace

M(.*{length of 6})d

but not

M(.*{length of 5})d

unless

M(.*{length of 5})ed

Any help appreciated!

Edit 1: I am using it like this:

s = Regex.Replace(sX, ss, sr);

Edit 2: I tried the following on the word 'mutation':

ss = M(.{1,6})n
sr = a, $1

All that is returned is

a, utatio

should it not be

ma, utation

?

Edit 3: NVM, I just realized that it searches for and replaces the whole ss string...

Edit 4: How do I use

\p{L}

in it? It is kind of relevant because this minimum and maximum length does not work on anything but English... and now it is useless to me...

1 Answer

Use curly bracketed numbers:

M.{5}d //M, then exactly 5 of anything, then d
M.{5,7}d //M, then between 5 and 7 of anything, then d
M.{0,5}d //M, then up to 5 of anything, then d
M.{5,}d //M, then at least 5 of anything, then d

In regex, {x,y} means "between x and y of the previous character or character class". If y is not specified, infinity is assumed. If the comma is omitted, the single number means "exactly this many". Omitting x is less portable: some engines assume 0, but others, .NET among them, treat {,5} as literal text, so write the 0 explicitly. Omitting both bounds is the least reliable form: depending on the engine, {,} may be read as literal text rather than as a range from 0 to infinity.
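
A minimal sketch of the four forms, using Python's re module purely for illustration (the question uses .NET's Regex, and the test words and the helper name matching_words are my own assumptions). Each pattern is applied with fullmatch so the number of characters between the M and the final d is tested exactly:

```python
import re

# Hypothetical test words; lengths between "M" and the final "d" are 1, 4, 5, 6 and 11.
words = ["Mud", "Mended", "Mustard", "Murdered", "Misunderstood"]

patterns = {
    "exactly 5": r"M.{5}d",
    "5 to 7": r"M.{5,7}d",
    "0 to 5": r"M.{0,5}d",   # lower bound written out, since {,5} is not portable
    "at least 5": r"M.{5,}d",
}

def matching_words(pattern):
    """Return the words that the pattern matches from start to end."""
    return [w for w in words if re.fullmatch(pattern, w)]

for label, pattern in patterns.items():
    print(label, pattern, matching_words(pattern))
```

Only "Mustard" has exactly five characters between the M and the closing d, so it is the only word accepted by all four patterns.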

PS: If you think about it, these are equivalent:

//zero or more of previous 
.*
.{0,}

//one or more of previous
.+
.{1,}
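
If you want to convince yourself of the equivalence, here is a small check in Python's re module (used only for illustration; the sample strings and the helper name same_matches are assumptions of mine). It compares what each pair of patterns finds on a few inputs, including the empty string:

```python
import re

# Hypothetical sample inputs, including the empty string.
samples = ["", "a", "Murdered", "two words"]

def same_matches(first, second):
    """True if both patterns find identical matches in every sample."""
    return all(re.findall(first, s) == re.findall(second, s) for s in samples)

star_equivalent = same_matches(r".*", r".{0,}")
plus_equivalent = same_matches(r".+", r".{1,}")
# .* and .+ are NOT the same: only .* matches the empty string.
star_equals_plus = same_matches(r".*", r".+")

print(star_equivalent, plus_equivalent, star_equals_plus)
```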

You can also apply the ? lazy modifier to a ranged quantifier. Both of these mean "match an M, then at least one but as few characters as possible, then a d", i.e. they will match the "Murd" in "Murdered":

M.+?d
M.{1,}?d
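
A quick sketch of greedy versus lazy on "Murdered", again in Python's re module for illustration only (the variable names are mine). The greedy form runs to the last d, while both lazy forms stop at the first one:

```python
import re

word = "Murdered"

# Greedy: take as many characters as possible, then back off to a "d".
greedy = re.search(r"M.{1,}d", word).group()

# Lazy: take as few characters as possible before the first "d".
lazy_range = re.search(r"M.{1,}?d", word).group()
lazy_plus = re.search(r"M.+?d", word).group()

print(greedy, lazy_range, lazy_plus)
```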

Edit 2:

As per Seb's comment, because Murdered contains two d characters, M.{1,5}d still achieves a match, on the partial "Murd".

You'll have to do something else, like editing the regex so it finishes with a \b to match the end of the word. This will prevent it matching the partial Murd, but still allow a match on mud, mind, meld, marred, etc.
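
Roughly, the effect of the trailing \b looks like this. It is a sketch in Python's re module rather than .NET, and the IGNORECASE flag and the helper name first_match are assumptions I added so that the lower-case words can match a pattern starting with M:

```python
import re

words = ["Murdered", "mud", "mind", "meld", "marred"]

def first_match(pattern, word):
    """Return the matched text, or None if the pattern finds nothing."""
    found = re.search(pattern, word, flags=re.IGNORECASE)
    return found.group() if found else None

# Without \b the pattern is satisfied by the partial "Murd".
loose = [first_match(r"M.{1,5}d", w) for w in words]

# With \b the "d" must be the last character of the word.
bounded = [first_match(r"M.{1,5}d\b", w) for w in words]

print(loose)
print(bounded)
```

"Murdered" drops out of the second list because it has six characters between the M and the final d, one more than the quantifier allows.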

Edit 3:

. is frequently a poor choice of character to use - "match anything" is often more than you want. If you want to match only word characters, or everything that isn't whitespace, you can use those character classes instead:

M\w{1,5}d
M\S{1,5}d

Always try to find ways of avoiding . especially if you're seeing problems of more things matching than you want
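
To illustrate the difference, here is a sketch in Python's re module (for illustration only; the sample sentence is made up). Because . also matches a space, the first pattern happily matches across two words, while \w and \S stay inside one word:

```python
import re

# Hypothetical sample text containing a space between "Mr" and "dog".
text = "Mr dog and Mildred"

any_char = re.findall(r"M.{1,5}d", text)    # . also matches the space
word_char = re.findall(r"M\w{1,5}d", text)  # letters, digits, underscore only
non_space = re.findall(r"M\S{1,5}d", text)  # anything except whitespace

print(any_char)
print(word_char)
print(non_space)
```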

Caius Jard
  • well, it did not work; I am using it like such: s = Regex.Replace(sX, ss, sr); – user3014330 Jul 02 '20 at 05:43
  • @user3014330: "It did not work" isn't particularly useful feedback. If you'd said "I tried using a regex of X on source Y with a replacement of Z; I expected R0 but got R1" (with values for all of those) then that would be useful. At the moment, the only response can be "well, it should work, you're probably doing something wrong, but we can't tell what". – Jon Skeet Jul 02 '20 at 05:47
  • If you want to match "murdered" you'll need to quantify 6, not 5, because "urdere" is six characters. I provided the whole post as guidance and teaching on how to use numbered ranges, rather than to solve the exact problem of "murdered". – Caius Jard Jul 02 '20 at 05:48
  • edited above... – user3014330 Jul 02 '20 at 05:49
  • Edited my answer – Caius Jard Jul 02 '20 at 05:58
  • sorry, I changed the word since the d is repeated... I have edited it – user3014330 Jul 02 '20 at 06:01
  • well, I was having a problem with (.*) as it ends up matching the whole sentence since it is not in English. I have edited the main issue I am having right now above. Should it not place the replacement in between the M and N? – user3014330 Jul 02 '20 at 06:04
  • Could you please read my answer and think about what I'm teaching; I'm trying to get you to a point where you understand what is going on and *why* all this does/doesn't work.. and you're just going "solve this problem for me, no wait, solve that problem for me, no wait solve another problem for me". If I'm doing a bad job of teaching this, let me know but there is nothing to be gained from me just doing your work for you, and everything to be gained from you learning so you can solve these trivial problems yourself – Caius Jard Jul 02 '20 at 06:05
  • As mentioned at the very end of my answer (above edit 3), put a \b word boundary marker in – Caius Jard Jul 02 '20 at 06:06
  • I'll take note of it, thanks – user3014330 Jul 02 '20 at 06:12
  • How do I add {Lo} to (.{1,5})? I know this is not part of the question but... – user3014330 Jul 02 '20 at 06:39
  • If you mean `\p{Lo}` as in "other Unicode letter not having a lower or upper case variation" then that's a character class like `[a-z]` is, so `\p{Lo}{1,5}` means "between 1 and 5 characters in the `\p{Lo}` class" – Caius Jard Jul 02 '20 at 07:47