I'm trying to figure out the correct way to remove all punctuation and whitespace from a string while leaving apostrophes intact. For example:

`won't` would remain `won't`, but `"desire."` would turn into `desire`.

I've tried using `replaceAll("[\\W]", "")`, `replaceAll("/\\p{P}(?<!')/", "")`, and `replaceAll("[^a-zA-Z]", "")`, but they all leave the punctuation fully intact.
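
For reference, here is a small sketch of what each of the three patterns above returns when the result of `replaceAll` is captured. Java strings are immutable, so `replaceAll` returns a new string rather than modifying the original; whether that explains the "fully intact" behaviour is an assumption, since the calling code is not shown. The class and method names (`AttemptsDemo`, `nonWord`, `slashed`, `lettersOnly`) are made up for illustration.

```java
public class AttemptsDemo {
    // Attempt 1: removes every non-word character, apostrophes included.
    static String nonWord(String s) {
        return s.replaceAll("[\\W]", "");
    }

    // Attempt 2: the slashes are literal characters in Java (there are no
    // /.../ regex delimiters), so this only matches "/<punctuation>/".
    static String slashed(String s) {
        return s.replaceAll("/\\p{P}(?<!')/", "");
    }

    // Attempt 3: removes everything except ASCII letters, apostrophes included.
    static String lettersOnly(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    public static void main(String[] args) {
        String input = "won't \"desire.\"";
        System.out.println(nonWord(input));     // wontdesire
        System.out.println(slashed(input));     // unchanged input
        System.out.println(lettersOnly(input)); // wontdesire
    }
}
```

So the first and third patterns do strip punctuation, but they take the apostrophe with it, while the second never matches ordinary text.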

Wiktor Stribiżew
MileHigh

1 Answer

String s = "don't.";
s = s.replaceAll("(')|(\\W)", "$1");
M B
  • The `\w` character class includes the underscore, so `\W` does not remove it. Therefore, [your solution doesn't work](http://ideone.com/0u811d). – BackSlash Nov 11 '16 at 18:13
  • @BackSlash The user did not mention anything related to the underscore. He said he wants to remove punctuation and white space, which are both included in the `\W` class. – M B Nov 11 '16 at 19:00
  • [Here](https://en.wikipedia.org/wiki/Punctuation) Underscore is listed as a punctuation mark used in general typography. Also, the OP was trying to match `\p{P}` in his regex, and [that pattern includes underscore](http://ideone.com/h9eLlY). – BackSlash Nov 12 '16 at 14:31
  • @BackSlash The 3 instructions he mentioned all have different results, and none of them achieve what he wants. The first one he posted was supposed to achieve exactly what I said. If he really wants to remove underscore, he can simply add (_) as third capturing group and it will be removed. If the user feels this does not achieve what he's trying to do he can always reply and enlighten us. – M B Nov 12 '16 at 19:05
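
If underscores should be removed as well, the variation suggested in the last comment could be sketched as below. It keeps the answer's approach (capture the apostrophe, put it back with `$1`) and adds `_` to the set of characters to drop. The class and method names (`ApostropheKeeper`, `strip`) are made up for illustration, and note that `\W` is ASCII-based by default in Java, so accented letters would be removed too.

```java
public class ApostropheKeeper {
    // Group 1 captures an apostrophe; the second alternative matches any
    // non-word character or an underscore. Replacing with "$1" puts the
    // apostrophe back and substitutes nothing when group 1 did not match.
    static String strip(String s) {
        return s.replaceAll("(')|[\\W_]", "$1");
    }

    public static void main(String[] args) {
        System.out.println(strip("don't."));                  // don't
        System.out.println(strip("snake_case \"desire.\""));  // snakecasedesire
    }
}
```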