I have this variable `String var = class.getSomething`
that contains this URL: http://www.google.com§°§#[]|£%/^<>
The output that comes out is this: http://www.google.comÃ§Â°Â§#[]|Â£%/^<>
How can I delete that Ã? Thanks!

-
Well, why don't you just use an encoding that does it for you? Today it's a `Â` or `Ã`, but tomorrow it may be something else. – Turtle Sep 29 '17 at 08:30
-
As @Nathan said, it's better to use an encoding than to replace the characters, because in the future you may actually need the character `Ã` and it would get replaced. – Procrastinator Sep 29 '17 at 08:40
-
I don't think that modifying the variable `var` will solve your problem. You don't describe how you produce the output. It may be that the unwanted characters are the result of some misinterpretation of the string's encoding during output, i.e., they aren't really in the string. So all the var.replace techniques proposed so far are useless. – laune Sep 29 '17 at 08:50
5 Answers
You could do this; it replaces every occurrence of the character with an empty string:
str = str.replace("Â", "");
That removes every `Â` from the string, which gives the result you want.

-
I thought it was a typo. If you compare the input string with the output, it looks like he wants to replace all of them. – Kenzo_Gilead Sep 29 '17 at 08:32
-
Ha, in my answer below I had also put in the wrong symbol before I submitted :) – achAmháin Sep 29 '17 at 08:33
-
@EliasMP I think he wants to replace both, he just doesn't know it himself yet. – Turtle Sep 29 '17 at 08:33
Specify the charset as UTF-8
to get rid of the unwanted extra characters:
String var = class.getSomething;
var = new String(var.getBytes(),"UTF-8");
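Note that `getBytes()` without an argument uses the platform default charset, so the result depends on the machine. A minimal sketch of the same round trip with both charsets named explicitly, assuming the text was originally UTF-8 bytes that were decoded as ISO-8859-1 somewhere upstream (the question does not confirm this; the class and method names here are hypothetical):

```java
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Re-encode the mis-decoded text as ISO-8859-1 to recover the original
    // bytes, then decode those bytes as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The garbled tail of the URL, written with Unicode escapes so the
        // example does not depend on the encoding of this source file.
        String garbled = "\u00c2\u00a7\u00c2\u00b0\u00c2\u00a7#[]|\u00c2\u00a3%/^<>";
        System.out.println(repair(garbled));
    }
}
```

This only helps if the extra characters are really in the string. If they only appear when the string is printed or written, the fix belongs on the output side instead.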

Do you really want to delete only that one character, or all invalid characters? In the latter case, you can check each character with `CharUtils.isAsciiPrintable(char ch)` from Apache Commons Lang. However, according to RFC 3986 even fewer characters are allowed in URLs (alphanumerics and "-_.+=!*'()~,:;/?$@&%", see Characters allowed in a URL).
In any case, you have to create a new String object (for example with replace, as in the answer by Elias MP, or by appending the valid characters one by one to a StringBuilder and converting it to a String), because Strings are immutable in Java.
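The StringBuilder approach could look roughly like this. It is only a sketch: the class and method names are made up for illustration, and the range check is written out by hand (32 to 126, the same range Commons Lang's `CharUtils.isAsciiPrintable` tests) so that it runs without the library.

```java
class AsciiFilter {
    // True for the printable ASCII range, 32 (space) through 126 (~).
    static boolean isAsciiPrintable(char ch) {
        return ch >= 32 && ch < 127;
    }

    // Builds a new String containing only the printable ASCII characters.
    static String keepAsciiPrintable(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (isAsciiPrintable(ch)) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Non-ASCII characters are written as Unicode escapes.
        String url = "http://www.google.com\u00c3\u00a7\u00c2\u00b0#[]|\u00c2\u00a3%/^<>";
        System.out.println(keepAsciiPrintable(url)); // http://www.google.com#[]|%/^<>
    }
}
```

Keep in mind that this drops every non-ASCII character, including ones that were meant to be there, such as `§` or `£`.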

The string in var
is output using utf-8, which results in the byte sequence:
c2 a7 c2 b0 c2 a7 23 5b 5d 7c c2 a3 25 2f 5e 3c 3e
This happens to be the ISO-8859-1 encoding of the characters as you see them:
Â§Â°Â§#[]|Â£%/^<>
Ã§Â°Â§#[]|Â£%/^<>
C2 is the encoding for Â.
I'm not sure how the Ã was produced; its encoding is C3.
We need the full code to learn how this happened, and a description of how the character encoding for text files on your system is configured.
Modifying the variable var
is useless.
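To reproduce the effect described above, here is a small sketch (class and method names are my own, not from the question). It encodes the tail of the URL as UTF-8, prints the bytes in hex, and then decodes those same bytes as ISO-8859-1. It assumes this kind of mis-decoding is what happens on the output side, which the question does not confirm.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Formats bytes as space-separated two-digit lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encodes the text as UTF-8, then wrongly decodes the bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The special characters after the host name, as Unicode escapes.
        String tail = "\u00a7\u00b0\u00a7#[]|\u00a3%/^<>";
        // c2 a7 c2 b0 c2 a7 23 5b 5d 7c c2 a3 25 2f 5e 3c 3e
        System.out.println(toHex(tail.getBytes(StandardCharsets.UTF_8)));
        // Each two-byte character now shows up with an extra leading U+00C2.
        System.out.println(misdecode(tail));
    }
}
```

The string itself never contains the extra characters here; they only appear once the UTF-8 bytes are read back with the wrong charset.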
