17

I've got some horrible text that I'm cleaning up using several C# regular expressions. One issue that has me stumped is that the text contains a number of '\r\n' strings: the actual characters (backslash, r, backslash, n), not line breaks.

I've tried:

content = Regex.Replace(content, "\\r\\n", "");

and:

content = Regex.Replace(content, "\r\n", "");

but neither of them works. In the end I had to use:

content = content.Replace("\\r\\n", "\r\n");

to get the project finished, but not being able to do it in a regex annoys me.

Skrealin
  • could this help? http://stackoverflow.com/questions/1981947/how-can-i-remove-r-n-from-a-string-in-c-can-i-use-a-regex – SubniC Nov 30 '10 at 08:44
  • content.Replace(@"\r\n", "\r\n") is your best choice. – VVS Nov 30 '10 at 08:46
  • @Jens: Of course. I meant to say that it's the best choice and far better than using a regex for such a trivial task. – VVS Nov 30 '10 at 09:00
  • This question is answered here; please check it, it is a good solution: https://stackoverflow.com/a/1982317/2208645 – Suraj Bhatt Dec 10 '20 at 07:03

7 Answers

27

\r and \n have special meaning in regular expressions too, so each backslash needs to be escaped. These backslashes then need to be escaped again for the C# string, leading to

content = Regex.Replace(content, "\\\\r\\\\n", ""); 

or

content = Regex.Replace(content, @"\\r\\n", ""); 
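
To make the difference concrete, here is a minimal sketch comparing the pattern from the question with the two escaped forms. The class name `LiteralCrLfDemo` and its method names are made up for illustration; only `Regex.Replace` comes from the framework.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class LiteralCrLfDemo
{
    // The first attempt from the question. The C# string "\\r\\n" reaches
    // the regex engine as \r\n, which matches a real CR+LF pair, so
    // literal backslash characters in the input are left untouched.
    public static string QuestionAttempt(string content)
    {
        return Regex.Replace(content, "\\r\\n", "");
    }

    // Regular string literal: every backslash is escaped twice,
    // once for C# and once for the regex engine.
    public static string RemoveWithRegularLiteral(string content)
    {
        return Regex.Replace(content, "\\\\r\\\\n", "");
    }

    // Verbatim string literal: only the regex-level escaping remains.
    public static string RemoveWithVerbatimLiteral(string content)
    {
        return Regex.Replace(content, @"\\r\\n", "");
    }
}
```

Assuming an input such as @"first\r\nsecond" (backslash characters, not a line break), `QuestionAttempt` should return it unchanged, while both escaped variants should return "firstsecond".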
Jens
7

It is a good idea to get into the habit of using verbatim string literals (@"example") when writing regular expressions in C#. In this case you needed this:

content = Regex.Replace(content, @"\\r\\n", "\r\n");

Otherwise you have to escape each backslash twice: once for the C# string, and a second time for the regular expression. So a single backslash becomes four backslashes in a standard string literal.
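
As a side note that is not part of the answer above, the regex-level escaping can also be delegated to `Regex.Escape`, so only the text to be matched has to be written out. This is a sketch under that assumption; the class and member names (`EscapeDemo`, `LiteralToLineBreaks`) are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class EscapeDemo
{
    // Regex.Escape adds the regex-level backslashes, so this holds
    // the same text as the verbatim literal @"\\r\\n".
    public static readonly string Pattern = Regex.Escape(@"\r\n");

    // Replaces the literal characters \r\n with a real CR+LF line break.
    public static string LiteralToLineBreaks(string content)
    {
        return Regex.Replace(content, Pattern, "\r\n");
    }
}
```

With this approach the only literal in the code is the text being searched for, which avoids counting backslashes by hand.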

Mark Byers
3
content = Regex.Replace(content, "\\\\r\\\\n", "");

might work. More info here.

Quote:

In literal C# strings, as well as in C++ and many other .NET languages, the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. This regular expression as a C# string, becomes "\\\\". That's right: 4 backslashes to match a single one.

Note: I had to write 8 backslashes in the next to last sentence so 4 backslashes would get displayed ;-)

darioo
3

A better and simpler answer is here. It works for me using Regex.

public static string GetMultilineBreak(this string content)
{
    return Regex.Replace(content, @"\r\n?|\n", "<br>"); 
}
Manjunath Bilwar
2

Within a specified input string, Regex.Replace replaces strings that match a regular expression pattern with a specified replacement string.

A typical usage would be

  string input = "This is   text with   far  too     much   " +  "   whitespace.";
  string pattern = "\\s+";
  string replacement = " ";
  Regex rgx = new Regex(pattern);
  string result = rgx.Replace(input, replacement);

Doesn't seem like that's what you are trying to do.

Robin Maben
0

The question is old, but there has been a change.

temp = Regex.Replace(temp, "\\n", " ");

or, better still:

string temp = Regex.Replace("tab    d_space  newline\n content here   :P", @"\s+", " ");
//tab d_space newline content here :P

This works on Universal Windows Applications and probably others too.

Rohit Hazra
-3

Wild guess here:

var bslash = System.IO.Path.DirectorySeparatorChar.ToString();

content = content.Replace(bslash + "r" + bslash + "n", "");
Rick Rat