ASP.NET Regex.Replace expression and method to replace all enclosed . characters

Question

I have this string:

The.Soundcraft.Si.Performer.1.is.digital.19.3.inch.mix.actually this is. a test

In this string I want to replace every . that has a character directly before AND after it with ". " (a dot followed by a space), UNLESS the leading or trailing character is a number or a space.

The end result would be:

The. Soundcraft. Si. Performer. 1.is. digital. 19.3. inch. mix. actually this is. a test

I tested my regex ([^0-9 ])\.([^0-9 ]) here: http://www.regexr.com/ and it seems to match all parts I need replaced.

So I coded this:

Dim description As String = "The.Soundcraft.Si.Performer.1.is.digital.19.3.inch.mix.actually this is. a test"
description = Regex.Replace(description, "([^0-9 ])\.([^0-9 ])", ". ")

But nothing happens. What am I missing?

Try `description = Regex.Replace(description, "\b\.\b", ". ")` — Wiktor Stribiżew, Feb 02 '16 at 21:57
Missing `@` in beginning of your regex. Should be `@"([^0-9 ])\.([^0-9 ])"`. — , Feb 02 '16 at 21:59
@WiktorStribiżew: Thanks! Your suggestion works, although I'm unsure why (I was reading this too: http://stackoverflow.com/questions/6664151/difference-between-b-and-b-in-regex). Is `\b` just excluding numbers (which I want) or also (some) special characters (which I don't want)? — Adam, Feb 02 '16 at 22:10
`\b` is a so-called word boundary. Read all about it here: http://www.regular-expressions.info/wordboundaries.html — Jeroen, Feb 03 '16 at 03:35
@noob The asp.net tag doesn't make answering a VB question with C# syntax useful. From the code given by the OP it seems pretty obvious to me which language this is. Unless Microsoft decided to add Dim to C# as a keyword while I was asleep. — Jeroen, Feb 03 '16 at 03:39

score 1 · Accepted Answer · answered Feb 03 '16 at 07:15

You can use

description = Regex.Replace(description, "\b\.\b", ". ")
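
To illustrate the "What am I missing?" part, here is a minimal, self-contained sketch that compares three calls on the string from the question. The module and function names are made up for this illustration. The `$1. $2` replacement is an assumption about what the original attempt was aiming for and is not taken from the thread. Only `Regex.Replace` and the two patterns come from the posts above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DotSpacingDemo
    Public Const Sample As String = "The.Soundcraft.Si.Performer.1.is.digital.19.3.inch.mix.actually this is. a test"

    ' Pattern from the question: both neighbours are captured, so they are part
    ' of the match and are replaced together with the dot (characters get lost).
    Public Function WithConsumingGroups(source As String) As String
        Return Regex.Replace(source, "([^0-9 ])\.([^0-9 ])", ". ")
    End Function

    ' Same pattern, but the captured neighbours are written back via $1 and $2.
    Public Function WithBackreferences(source As String) As String
        Return Regex.Replace(source, "([^0-9 ])\.([^0-9 ])", "$1. $2")
    End Function

    ' Suggested pattern: \b is zero-width, so only the dot itself is replaced.
    Public Function WithWordBoundaries(source As String) As String
        Return Regex.Replace(source, "\b\.\b", ". ")
    End Function

    Sub Main()
        Console.WriteLine(WithConsumingGroups(Sample))
        ' The. Soundcraft. Si. Performer.1.is. digital.19.3.inch. mix. actually this is. a test
        Console.WriteLine(WithBackreferences(Sample))
        ' The. Soundcraft. Si. Performer. 1. is. digital. 19. 3. inch. mix. actually this is. a test
        Console.WriteLine(WithWordBoundaries(Sample))
    End Sub
End Module
```

Two things are worth noting. Digits count as word characters, so `\b\.\b` also turns `1.is` into `1. is` and `19.3` into `19. 3`, which differs from the expected result in the question. The backreference variant leaves digit-adjacent dots alone, but because it consumes the neighbouring characters it can skip a dot when segments are one character long (for example `a.b.c` becomes `a. b.c`).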


Why does it work?

The word boundary \b can have 4 meanings depending on the context:

(?<!\w) in a construct like \b + word letter ([\p{L}\p{N}_])
(?<!\W) in a construct like \b + non-word letter ([^\p{L}\p{N}_])
(?!\w) in a construct like word letter ([\p{L}\p{N}_]) + \b
(?!\W) in a construct like non-word letter ([^\p{L}\p{N}_]) + \b.

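In your case, the 2nd and 4th cases apply: the . is a non-word character, so \b\.\b behaves like (?<!\W)\.(?!\W), that is, it matches a dot that is enclosed by word characters. Digits and _ are word characters too, so dots next to them also match. The two forms differ only at the very start or end of the string, where \b\.\b does not match but (?<!\W)\.(?!\W) does; the exact equivalent is (?<=\w)\.(?=\w).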
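
The relationship between `\b\.\b` and its lookaround forms can be checked with a small sketch. The module name, the helper `SpaceAfter` and the test string `.a.b.` are hypothetical and chosen only to expose the behaviour at the string edges.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BoundaryEquivalenceDemo
    ' Replaces every match of the given pattern with a dot followed by a space.
    Public Function SpaceAfter(source As String, pattern As String) As String
        Return Regex.Replace(source, pattern, ". ")
    End Function

    Sub Main()
        Dim patterns As String() = {"\b\.\b", "(?<=\w)\.(?=\w)", "(?<!\W)\.(?!\W)"}
        Dim inputs As String() = {"mix.actually this is. a test", ".a.b."}

        For Each source As String In inputs
            For Each pattern As String In patterns
                Console.WriteLine("{0,-18} -> ""{1}""", pattern, SpaceAfter(source, pattern))
            Next
        Next
    End Sub
End Module
```

For the first input all three patterns give `mix. actually this is. a test`. For `.a.b.` the first two give `.a. b.`, while the negative lookarounds also accept the leading and trailing dot and give `. a. b. ` (with a trailing space).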

EDGE CASE:

If you do not want to replace a . that is next to _, exclude _ from the set of characters treated as word characters. The pattern then becomes:

(?<![^\p{L}\p{N}])\.(?![^\p{L}\p{N}])
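
A short sketch of how the underscore case plays out. The module name, the helper and the sample string are invented for the illustration. `RequireBoth` is a hypothetical stricter variant that is not part of the answer, shown only because the negated lookarounds above also accept a dot at the very start or end of the string.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UnderscoreEdgeCaseDemo
    Public Const WordBoundary As String = "\b\.\b"
    Public Const NoUnderscore As String = "(?<![^\p{L}\p{N}])\.(?![^\p{L}\p{N}])"
    ' Assumption, not from the answer: require a letter or digit on both sides.
    Public Const RequireBoth As String = "(?<=[\p{L}\p{N}])\.(?=[\p{L}\p{N}])"

    ' Replaces every match of the given pattern with a dot followed by a space.
    Public Function SpaceAfter(source As String, pattern As String) As String
        Return Regex.Replace(source, pattern, ". ")
    End Function

    Sub Main()
        Dim source As String = "a_.b c._d e.f end."

        ' _ is a word character, so dots next to it are replaced:
        ' "a_. b c. _d e. f end."
        Console.WriteLine("""{0}""", SpaceAfter(source, WordBoundary))

        ' Dots next to _ are left alone, but the final dot is replaced:
        ' "a_.b c._d e. f end. "
        Console.WriteLine("""{0}""", SpaceAfter(source, NoUnderscore))

        ' Only dots between letters or digits are replaced:
        ' "a_.b c._d e. f end."
        Console.WriteLine("""{0}""", SpaceAfter(source, RequireBoth))
    End Sub
End Module
```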

