
My problem is similar to "String filter: detect non-ASCII signs in C#", but I need to remove all non-printable characters from a string except newline characters (\n).

Starting from the Regex option suggested there:

foo = System.Text.RegularExpressions.Regex.Replace(foo, @"[^\u0020-\u007E]+", string.Empty);

I've modified it in this way:

foo = System.Text.RegularExpressions.Regex.Replace(foo, @"[\u0000-\u0009\u000B-\u000C\u000E-\u001F\u007F]+", string.Empty);

This seems to work correctly, but could you suggest a less verbose solution? Thanks in advance.

user2595496

1 Answer

// Requires: using System.Linq; using System.Text.RegularExpressions;
Regex regex = new Regex(@"\p{C}+");
string strWithPrintableChars = string.Join('\n'.ToString(),
                    foo.Split('\n').Select(line => regex.Replace(line, "")));

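Explanation: \p{C} matches every character in the Unicode "Other" category (control, format, surrogate, private-use and unassigned characters), which includes \n itself. Splitting on '\n' first keeps the newlines away from the regex; each line is then cleaned and the lines are rejoined with '\n'.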
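As a possible single-pass alternative to splitting and rejoining, .NET regular expressions support character class subtraction, so \n can be exempted from \p{C} directly in the pattern. The following is only a minimal sketch: the class name PrintableFilter and the method name StripNonPrintable are hypothetical and not part of the original answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class PrintableFilter
{
    // [\p{C}-[\n]] = every character in the Unicode "Other" category,
    // minus line feed (character class subtraction).
    private static readonly Regex NonPrintableExceptLf =
        new Regex(@"[\p{C}-[\n]]+", RegexOptions.Compiled);

    // Removes control and other non-printable characters but keeps '\n'.
    public static string StripNonPrintable(string input)
    {
        return NonPrintableExceptLf.Replace(input, string.Empty);
    }
}
```

Usage would look like PrintableFilter.StripNonPrintable(foo). Note that \p{C} also covers surrogate code units, so with either approach characters outside the Basic Multilingual Plane (emoji, for example) would be stripped as well; if those need to survive, the explicit ranges from the question may be the safer choice.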

Mustafa R
  • Please add an explanation so that it's easier for posterity to understand – rdas Apr 16 '19 at 08:35
  • In the context of .NET Core and multiple platforms, I would suggest changing foo.Split('\n') to foo.Split(new char[] { '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries) and string.Join('\n'.ToString() to string.Join(Environment.NewLine, foo...... – Sebastian Widz Oct 15 '19 at 22:45
  • Thanks Sebastian. The question asked about '\n' explicitly, which is why the answer uses it. My remarks on your points: 1. "foo.Split('\n') to foo.Split(new char[] { '\n', '\r' }" -- this is better if we are dealing with both line feed and carriage return. 2. "string.Join(Environment.NewLine, foo......" -- this should not be used because it may introduce '\r' into the result on Windows, which is not the objective of the question. – Mustafa R Oct 29 '19 at 06:07
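The two comments above can be combined in a small sketch: split on both '\r' and '\n', but always rejoin with "\n" so that no '\r' ends up in the result on any platform. The class name LineEndingVariants and the method name CleanJoinWithLf are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class LineEndingVariants
{
    private static readonly Regex Other = new Regex(@"\p{C}+");

    // Splits on both '\r' and '\n', cleans each line, and rejoins with "\n"
    // rather than Environment.NewLine.
    public static string CleanJoinWithLf(string foo)
    {
        var lines = foo.Split(new[] { '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join("\n", lines.Select(line => Other.Replace(line, "")));
    }
}
```

One side effect worth keeping in mind: StringSplitOptions.RemoveEmptyEntries also drops blank lines, so "a\r\nb\n\nc" comes back as "a\nb\nc". If blank lines must be preserved, that option is probably not what you want.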