17

My question concerns the replaceAll method of the String class.

My goal is to replace all the em-dashes in a text with a plain "-". I know the Unicode code point for the em-dash is \u2014.

I tried it in the following way:

String s = "asd – asd";
s = s.replaceAll("\u2014", "-");

Still, the em-dash is not replaced. What am I doing wrong?

user975705

4 Answers

32

Minor edit after question edit:

You might not be using an em-dash at all. If you're not sure what you have, a nice solution is to simply find and replace all dashes, em or otherwise. Take a look at this answer: you can use the Unicode dash punctuation property, \\p{Pd}, to match every kind of dash.

String s = "asd – asd";
s = s.replaceAll("\\p{Pd}", "-");

Working example: the code above replaces both an em dash and a regular dash.
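
As a self-contained illustration of the \\p{Pd} approach, here is a minimal sketch. The class and method names (DashNormalizer, normalizeDashes) are made up for this example, and the sample string uses Unicode escapes so that the exact characters are unambiguous.

```java
final class DashNormalizer {
    // Replaces every character in the Unicode "Dash Punctuation" category (Pd)
    // with an ASCII hyphen-minus. Hypothetical helper, named for this sketch only.
    static String normalizeDashes(String text) {
        return text.replaceAll("\\p{Pd}", "-");
    }

    public static void main(String[] args) {
        // Contains a hyphen-minus, an en dash (U+2013) and an em dash (U+2014).
        String sample = "hyphen - en \u2013 em \u2014 done";
        System.out.println(normalizeDashes(sample)); // prints: hyphen - en - em - done
    }
}
```

Note that the mathematical minus sign (U+2212) belongs to the symbol category Sm rather than Pd, so this pattern leaves it untouched.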

References:
public String replaceAll(String regex, String replacement)
Unicode Regular Expressions

Community
Peter Ajtai
3

Based on what you posted, the problem may lie not with your code but with the dash you assume you have. What you have looks like an en dash (the width of a capital N) rather than an em dash (the width of a capital M). The en dash is U+2013; try using that instead and see if the replacement works.
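
If it is unclear which dash a string actually contains, one way to check is to print its code points. This is only a sketch; DashInspector and describe are names invented for the example.

```java
final class DashInspector {
    // Returns one "U+XXXX" entry per code point, separated by spaces,
    // so look-alike characters can be told apart.
    static String describe(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // An en dash between two letters, written as an escape to be explicit.
        System.out.println(describe("a\u2013b")); // prints: U+0061 U+2013 U+0062
    }
}
```

Running this on the string from the question should show whether the dash is U+2013 (en dash) or U+2014 (em dash).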

Charlie
2

String.replaceAll takes a regex as its first parameter. If you just want to replace all occurrences of a single char with another char, consider using String.replace(char, char):

String s = "asd – asd";
s = s.replace('\u2014', '-');
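
Since the dash in the question may be an en dash rather than an em dash, a variant that handles both with the same non-regex method might look like the sketch below. LiteralDashReplacer and replaceEnAndEmDashes are hypothetical names used only here.

```java
final class LiteralDashReplacer {
    // Uses String.replace(char, char), which does no regex parsing,
    // once for the en dash (U+2013) and once for the em dash (U+2014).
    static String replaceEnAndEmDashes(String text) {
        return text.replace('\u2013', '-').replace('\u2014', '-');
    }

    public static void main(String[] args) {
        String s = "asd \u2013 asd \u2014 asd";
        System.out.println(replaceEnAndEmDashes(s)); // prints: asd - asd - asd
    }
}
```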

Vivien Barousse
1

It works fine for me. My guess is that you're not using an em-dash. Try copy-pasting the em-dash character from the character map instead of from Word.
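
To rule out copy-paste issues entirely, the input can be written with Unicode escapes. The sketch below (EmDashCheck and replaceEmDash are made-up names) runs the call from the question against a real em dash and against an en dash.

```java
final class EmDashCheck {
    // The same call as in the question: replaces only U+2014 (em dash).
    static String replaceEmDash(String text) {
        return text.replaceAll("\u2014", "-");
    }

    public static void main(String[] args) {
        // A genuine em dash is replaced.
        System.out.println(replaceEmDash("asd \u2014 asd")); // prints: asd - asd
        // An en dash (U+2013) does not match, so the string is unchanged.
        System.out.println(replaceEmDash("asd \u2013 asd").equals("asd \u2013 asd")); // prints: true
    }
}
```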

JRL