2

I have the following string: "WordContainingYes. no yes,- no! yes. no". I need to replace all instances of "yes.", but leave "WordContainingYes." intact. I'm using "\b(yes.)\b", but it doesn't work when there is a punctuation mark inside the pattern. Does anyone know how I should match a whole word plus the punctuation mark after it?

UPDATE

I need to match any punctuation mark after the word, not only a dot.

Thanks

Davita
  • I don't know about C#, but grep uses `\<` and `\>` to match words. – elyashiv Feb 22 '14 at 17:37
  • The `.` in the regex needs to be escaped: `\.` – Jones Feb 22 '14 at 17:37
  • @elyashiv could you elaborate a bit more please? – Davita Feb 22 '14 at 17:38
  • How about `\s(yes.)` ? – Bryan Elliott Feb 22 '14 at 17:39
  • @MElliott thanks, your answer is the closest one. It works, but it also removes whitespace before the word (when doing replace). any idea how to fix that? :) – Davita Feb 22 '14 at 17:43
  • @Davita, yes, I supplied an answer that will not replace the space. :) – Bryan Elliott Feb 22 '14 at 17:48
  • @Davita `\p{P}` means [punctuation char](http://msdn.microsoft.com/en-us/library/20bw873z(v=vs.110).aspx). So you can use `[\b\p{P}]` – L.B Feb 22 '14 at 18:39
  • @L.B - Inside character classes, I think `\b` means the backspace character, not a word boundary. –  Feb 22 '14 at 18:54
  • @sln No. http://stackoverflow.com/questions/6664151/difference-between-b-and-b-in-regex – L.B Feb 22 '14 at 19:00
  • @L.B - Don't know what you mean by 'No'. A word boundry is an assertion. Assertions are allowed in character classes? No .. they are not. `[\b]` matches the backspace character and nothing else. –  Feb 22 '14 at 21:40
  • @sln **No** for `\b means backspace character`. It not mean backspace character in Regex. **The `\b` metacharacter is used to find a match at the beginning or end of a word.** http://www.w3schools.com/jsref/jsref_regexp_begin.asp Is it clear now? – L.B Feb 22 '14 at 21:50
  • @L.B - Yes to [\b] means backspace character. It's a fact, read basic regular expression tutorial. –  Feb 23 '14 at 02:43
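
The disagreement in the comments above comes down to where `\b` appears. As a rough illustration, here is a minimal sketch in Python's `re` module, used as a stand-in for the .NET engine the question is about. Both engines document `\b` inside a character class as the backspace character and `\b` outside one as a word-boundary assertion. The variable names are made up for this example.

```python
import re

# Inside a character class, \b is the backspace character (U+0008).
in_class = re.compile(r"[\b]")

# Outside a character class, \b is a zero-width word-boundary assertion.
boundary = re.compile(r"\byes\b")

# No backspace character in the text, so the class finds nothing.
print(in_class.search("yes no"))

# A literal backspace between the words is what the class matches.
print(repr(in_class.search("yes\x08no").group()))

# The boundary assertion matches the whole word and consumes no extra characters.
print(boundary.search("yes no").group())
```

So a pattern such as `[\b\p{P}]` would look for a backspace or a punctuation character. It would not assert a word boundary.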

5 Answers

2

You could use this:

(?<=\s)(yes.)

Working regex example:

http://regex101.com/r/dO3rD9

This uses a "lookbehind" for space, so when using replace, the space won't get replaced.

As per OP's comment above: "It works, but it also removes whitespace before the word (when doing replace). any idea how to fix that?"
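
To make the difference concrete, here is a small sketch using Python's `re` module as a stand-in for the C# engine; the lookbehind syntax is the same in both. It assumes the dot is escaped as `\.`, and the variable names are hypothetical. It compares consuming the whitespace with only asserting it.

```python
import re

text = "WordContainingYes. no yes,- no! yes. no"

# \s is part of the match here, so the space before "yes." is removed too.
consuming = re.sub(r"\s(yes\.)", "", text)

# (?<=\s) only asserts that whitespace precedes the word; it is not consumed,
# so the space survives the replacement.
lookbehind = re.sub(r"(?<=\s)(yes\.)", "", text)

print(repr(consuming))
print(repr(lookbehind))
```

Note that `(?<=\s)` requires an actual whitespace character before the word, so a "yes." at the very start of the string would not be matched by this sketch.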

Bryan Elliott
  • Thanks. I guess you forgot \ before ., otherwise it works fine :) One more thing: is it possible to match the whole word when there is no punctuation mark, and match the whole word + punctuation when there is one (the latter is the solution you provided)? I mean, to merge \b(word)\b and (?<=\s)(yes\.). Thanks :) – Davita Feb 22 '14 at 17:54
  • Something to write \b(word)\b OR (?<=\s)(yes\.) – Davita Feb 22 '14 at 17:57
  • @Davita, Yes, I always stay away from word boundaries if I can because of problems with special chars. Instead I usually use: `(?<=\s).*?(?=\s)` . That will match whole words, regardless of punctuation or special characters. – Bryan Elliott Feb 22 '14 at 17:58
  • No, I probably didn't explain it correctly, sorry, I'm not a native English speaker. What I mean is to first try to match the whole word only, such as \b(word)\b, and if that isn't found (probably due to a punctuation mark), fall back to (?<=\s)(yes\.) – Davita Feb 22 '14 at 18:01
  • @Davita, Oh, you mean like this? `(?<=\s)(yes[^\s]?)` ? This will match " yes," or " yes." or " yes" – Bryan Elliott Feb 22 '14 at 18:08
  • @MElliott - `(?<=\s)(yes[^\s]?)` matches yes + any character that is not whitespace; that's about 127 minus about 7 characters. –  Feb 22 '14 at 21:48
1

Try this :

\byes\.\b

UPDATE :

\s(yes.?)\s

DEMO : http://regexr.com?38bnn


P.S. `.` is a special character in regex, meaning "match any character". So it has to be escaped (`\.`)

Dr.Kameleon
0

I think that @Jones got the point: . (dot) is a special symbol, and needs to be escaped. Try the following:

\byes\.\b

If you want to match any punctuation mark, you should use something like this:

\byes[^\w]\b

which will match yes followed by any non-word character. You might want to be more precise and actually write out all the punctuation marks (I assume you don't, because you used . before).

elyashiv
0

This regex should work for you (assuming no Unicode in the input string):

(?<=\b)yes[^a-zA-Z0-9]

Sabuj Hassan
0

You could probably match either punctuation or a word boundary.

Note: you have to be careful when specifying something like \byes\.\b
The \. to the left of the final \b is a non-word character, so for that \b to match
there has to be a word character (\w) to its right, or it won't match.

So, don't do that.
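
The effect is easy to check. The sketch below uses Python's `re` as a stand-in for the C# engine (`\b` outside a character class behaves the same way in both) and runs the pattern against the string from the question and against a made-up string in which a word character follows the dot. The variable names are hypothetical.

```python
import re

text = "WordContainingYes. no yes,- no! yes. no"

# Every "yes." in the sample is followed by a space. Both "." and " " are
# non-word characters, so there is no word boundary between them.
in_sample = re.findall(r"\byes\.\b", text)

# The trailing \b only succeeds when a word character follows the dot.
before_word = re.findall(r"\byes\.\b", "yes.no")

print(in_sample)
print(before_word)
```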

This might work.

\b(yes(?:\p{P}|\b))

With a slight modification, you can exclude certain punctuation like this.
This captures any punctuation except quotes (which is then deleted as part of the replacement), or otherwise just matches a word boundary.

\b(yes(?:[^\P{P}'"]|\b))

Another alternative is to include just the punctuation you want.

\b(yes(?:[.,+*?-]|\b))
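
As a rough check of this last pattern, here is a minimal sketch in Python, used as a stand-in for C#. Python's built-in `re` module has no `\p{...}` classes, so only the explicit-list variant is shown. `PATTERN` and `remove_word` are hypothetical names, and the case-insensitive flag is an added assumption to show that "WordContainingYes." is still left alone.

```python
import re

# The word, followed either by one of the listed punctuation marks
# or, failing that, by a plain word boundary.
PATTERN = re.compile(r"\b(yes(?:[.,+*?-]|\b))", re.IGNORECASE)

def remove_word(text):
    """Delete every whole-word "yes" together with one trailing punctuation mark."""
    return PATTERN.sub("", text)

sample = "WordContainingYes. no yes,- no! yes. no"

# "yes," and "yes." are removed; "WordContainingYes." has no word boundary
# before its "Yes", so it stays intact even when case is ignored.
print(repr(remove_word(sample)))

# Without punctuation, the \b alternative matches the bare word.
print(repr(remove_word("yes or no")))

# A longer word that merely starts with "yes" is not touched.
print(repr(remove_word("yesterday")))
```

Only one punctuation character is consumed per match, which is why the "-" after "yes," remains in the output.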