1

I have the following String passed from another application.

2�4�9�

I would like to remove the question mark ASCII characters from the above string.

How could I do this?

Jacob
  • 3
    These are not question marks but replacements for Unicode characters that cannot be represented in ASCII. Fix the code that produces this string. –  May 14 '15 at 07:00
  • @Tichodroma I'm receiving String from an application which I do not have any control. – Jacob May 14 '15 at 07:02
  • 2
    I fear that in this case you will not receive any useful data. There must have been something with a meaning where the replacements are now. That is lost, and your data is corrupt. Fix the source. –  May 14 '15 at 07:10
  • ASCII characters' range is 0-127. This character is at code point 65533, far outside ASCII range. – phuclv May 14 '15 at 07:29
  • 1
    How do you receive this string from the other application? Did you do any decoding from a byte array? Code is best. – weston May 14 '15 at 07:35
  • You could look at it this way: � indicates where characters were removed. That can't be a good thing, especially if you don't know what the characters were. – Tom Blodget May 15 '15 at 03:14
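The comments above suggest the � characters appear when bytes are decoded with the wrong charset. A minimal sketch of that failure mode follows. The byte values and the guess that the original data contained degree signs encoded as ISO-8859-1 are purely assumptions for illustration; the actual source data is unknown.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Decodes raw bytes as UTF-8; malformed sequences become U+FFFD.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decodes raw bytes as ISO-8859-1, where every byte maps to one character.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical payload: '2', 0xB0, '4', 0xB0, '9', 0xB0.
        // 0xB0 is the degree sign in ISO-8859-1 but is not valid UTF-8 on its own.
        byte[] raw = {0x32, (byte) 0xB0, 0x34, (byte) 0xB0, 0x39, (byte) 0xB0};

        String wrong = decodeUtf8(raw);
        String right = decodeLatin1(raw);

        System.out.println(wrong.equals("2\uFFFD4\uFFFD9\uFFFD")); // true
        System.out.println(right.equals("2\u00B04\u00B09\u00B0")); // true
    }
}
```

If the sender's charset can be found out, decoding with it at the point where the bytes are received preserves the data instead of discarding it.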

2 Answers

5

According to this Unicode code table, � is code point 65533 (U+FFFD, written \ufffd in Java), the Unicode replacement character.

You can remove this Unicode character from your string with:

str = str.replaceAll("�", "");

But you should really try to understand why they are there.
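A minimal, self-contained sketch of that removal, assuming the unwanted characters really are U+FFFD. It uses `String.replace`, which does a literal (non-regex) replacement, and assigns the result because Java strings are immutable. The class name, method name, and sample value are illustrative only.

```java
public class RemoveReplacementChar {
    // Returns a copy of the input with every U+FFFD (REPLACEMENT CHARACTER) removed.
    static String stripReplacementChar(String input) {
        return input.replace("\uFFFD", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample mirroring the string in the question.
        String received = "2\uFFFD4\uFFFD9\uFFFD";

        // The result must be assigned; calling replace() alone changes nothing.
        String cleaned = stripReplacementChar(received);

        System.out.println(cleaned); // prints 249
    }
}
```

This only hides the symptom: whatever the three characters originally were is not recovered.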

Serge Ballesta
  • If I try `str.replaceAll("�","");`, I am able to remove the unwanted characters. With `str.replaceAll("\ufffd", "");` it didn't work. – Jacob May 14 '15 at 07:17
  • @user75ponic: I could not test (no access to my dev box). Post edited. – Serge Ballesta May 14 '15 at 07:27
1
string.replaceAll("\u0000.*","").replaceAll("[^a-zA-Z0-9 ]", "");

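will remove everything from the first NUL character (\u0000) to the end of the line, and then every character that is not an ASCII letter, a digit, or a space. That covers � and punctuation marks; spaces are kept. The result has to be assigned back, since replaceAll returns a new string.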
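To see what the two chained calls do in practice, here is a small sketch wrapping them in a helper. The class name, method name, and sample inputs are hypothetical and chosen only to exercise each regex.

```java
public class WhitelistDemo {
    // Applies the two replaceAll calls from the answer and returns the result.
    static String keepAlphanumerics(String input) {
        return input
                // Drop everything from the first NUL character to the end of the line.
                .replaceAll("\u0000.*", "")
                // Drop every character that is not an ASCII letter, digit, or space.
                .replaceAll("[^a-zA-Z0-9 ]", "");
    }

    public static void main(String[] args) {
        // U+FFFD and punctuation are removed; the space survives.
        System.out.println(keepAlphanumerics("2\uFFFD4\uFFFD9\uFFFD, ok!")); // prints 249 ok

        // Text after an embedded NUL is discarded entirely.
        System.out.println(keepAlphanumerics("ab\u0000cd")); // prints ab
    }
}
```

Because this is a whitelist, it also removes legitimate non-ASCII text such as accented letters, so it is a much blunter tool than removing U+FFFD alone.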

Don Chakkappan