I have a string with multiple dashes, but some of them are long dashes rather than plain hyphens.

What method can I use to normalize the dashes? This is what I tried:

text = Regex.Replace(text, @"(\u2012|\u2013|\u2014|\u2015)", "-");

The expected output is something like 11-1111-11/11. The actual output is almost the same, but some of the dashes are still long ones. (I can't include that dash here because Stack Overflow does not render it.)

tixovoxi

2 Answers

This works:

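 // Requires: using System.Text.RegularExpressions;
 // Matches Figure Dash, En Dash, Em Dash and Horizontal Bar (U+2012 through U+2015).
 private const string DashPattern = @"[\u2012\u2013\u2014\u2015]";
 private static readonly Regex _dashRegex = new Regex(DashPattern);

 // Replaces each long dash in s with a plain ASCII hyphen.
 public static string RemoveLongDashes(string s)
 {
     return _dashRegex.Replace(s, "-");
 }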

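Your expression with the pipe characters (|) uses alternation, which is valid regex syntax, but when every alternative is a single character, a character class is simpler. For example, if you want to replace all of the vowels, you use an expression like @"[aeiou]", i.e., the choices listed within a set of square brackets.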

Flydog57
  • I'm intrigued why this was down-voted. It provides a way to convert all of the various Unicode dash characters (Figure Dash, En Dash, Em Dash and Horizontal Bar (U+2012 through U+2015)) into a plain old ASCII-ish hyphen. That was the gist of the original question. – Flydog57 Aug 26 '19 at 13:59
  • Thanks man... this should be the answer – VR1256 Mar 31 '20 at 18:29

Here is some info on the em dash. You might be able to copy and paste the dash from this post into your code and use string.Replace:

The em dash

Look in the following SO post for the answer:

replacing the em dash

It looks like the following code (Java) solved the issue for others:

String s = "asd – asd";
s = s.replaceAll("\\p{Pd}", "-");
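
For reference, here is a minimal self-contained sketch of the same `\p{Pd}` approach. The class name `DashNormalizer` and method name `normalizeDashes` are made up for illustration and are not from the linked post. `\p{Pd}` is the Unicode "Punctuation, Dash" category, so it also covers U+2010 (Hyphen) and U+2011 (Non-Breaking Hyphen), which the U+2012 through U+2015 range in the question does not. It does not cover U+2212 (Minus Sign), which is a math symbol rather than dash punctuation.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class DashNormalizer {
    // \p{Pd} matches any character in the Unicode "Punctuation, Dash" category.
    private static final Pattern DASHES = Pattern.compile("\\p{Pd}");

    // Replaces every dash-punctuation character in s with an ASCII hyphen.
    static String normalizeDashes(String s) {
        return DASHES.matcher(s).replaceAll("-");
    }

    public static void main(String[] args) {
        // Em dash (U+2014) and en dash (U+2013) written as escapes.
        String input = "11\u20141111\u201311/11";
        System.out.println(normalizeDashes(input)); // prints 11-1111-11/11
    }
}
```

.NET regular expressions accept the same `\p{Pd}` category, so the pattern should carry over to Regex.Replace in C#.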
Fractal