I want to remove all strange special characters from a string in Java. These characters appear as ? (question mark) in MS Word. An image of a sample string is given below.
Viewed 2,934 times
7
-
1 Learn about Unicode and UTF-8. – duffymo Mar 01 '16 at 10:14
-
3 And check http://stackoverflow.com/a/8519863/2166188 – Michal_Szulc Mar 01 '16 at 10:16
-
BTW define *strange*. Do you want to remove all non-ascii characters? – TheLostMind Mar 01 '16 at 10:18
-
Those characters are not removed by using textToConvert.replaceAll("[^\\x00-\\x7F]", "") – psms Mar 01 '16 at 10:33
-
1 Then use: `textToConvert.replaceAll("[\\x00-\\x7F]", "")` I don't see your problem. – ctst Mar 01 '16 at 10:36
2 Answers
2
You can use
String newString = my_string.replaceAll("\\p{C}", "");
For more information, see Java Unicode Regular expression.
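As a rough illustration of what `\p{C}` removes, here is a minimal sketch. The class name `StripInvisible`, the method `stripInvisible`, and the sample string are made up for this example and are not from the question. `\p{C}` matches the Unicode "Other" category (control, format, private-use, and unassigned characters), so tabs and line breaks are removed as well, while accented letters are kept.

```java
public class StripInvisible {
    // Hypothetical helper: removes every character in Unicode general
    // category C ("Other"), i.e. control, format, private-use, and
    // unassigned characters. Letters such as é are left untouched.
    static String stripInvisible(String input) {
        return input.replaceAll("\\p{C}", "");
    }

    public static void main(String[] args) {
        // Sample (assumed) input: contains U+200B ZERO WIDTH SPACE (format)
        // and U+0007 BELL (control) mixed into otherwise normal text.
        String sample = "caf\u00E9\u200B ok\u0007";
        System.out.println(stripInvisible(sample));

        // Note: tab is a control character too, so it is removed as well.
        System.out.println(stripInvisible("a\tb"));
    }
}
```

If tabs or line breaks must survive, the pattern would need to exclude them explicitly; whether that matters depends on the input.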

Emdadul Sawon
I suggest .replaceAll("[!@#$%ˆ&*/\\(\\)\\{\\};:<>/?,.|\\[\\]]", ""); it keeps accented letters such as é, á, ç, and ã for other languages. – Ricardo Gellman Jan 18 '19 at 12:46
1
This will work:
String string = yourString.replaceAll("[^\\x00-\\x7F]", "");
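To show the trade-off of this pattern, here is a small sketch. The class name `StripNonAscii`, the method `stripNonAscii`, and the sample input are invented for illustration. The pattern drops everything outside the 7-bit ASCII range, which includes accented letters and typographic quotes, but it leaves ASCII control characters in place.

```java
public class StripNonAscii {
    // Hypothetical helper: keeps only characters in the range U+0000..U+007F.
    static String stripNonAscii(String input) {
        return input.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // Sample (assumed) input: "naïve" with U+00EF, wrapped in
        // curly quotes U+201C and U+201D.
        String sample = "\u201Cna\u00EFve\u201D text";
        // The curly quotes and the ï are removed; plain ASCII remains.
        System.out.println(stripNonAscii(sample));
    }
}
```

Because ASCII control characters (for example U+0007) fall inside the kept range, this approach alone may not remove every character that renders as ? in a word processor.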

Santosh Jadi