0

I have a variable `String var = class.getSomething` that contains this URL: http://www.google.com§°§#[]|£%/^<>. The output that comes out is this: http://www.google.comÃ§Â°Â§#[]|Â£%/^<>. How can I delete that Ã? Thanks!

Francesco
  • Well, why don't you just use an encoding that does it for you? Today it's a `Â` or `Ã`, but tomorrow it may be something else. – Turtle Sep 29 '17 at 08:30
  • As @Nathan said, it's better to use an encoding than to replace the characters, because in the future you may actually need the character `Ã` and it would get replaced. – Procrastinator Sep 29 '17 at 08:40
  • I don't think that modifying the variable `var` will solve your problem. You don't describe how you produce the output. It may be that the unwanted characters are the result of a misinterpretation of the string's encoding during output, i.e., they aren't really in the string. So all the `var.replace` techniques proposed so far are useless. – laune Sep 29 '17 at 08:50

5 Answers

1

You could do this; it replaces every occurrence of the character with an empty string, which serves your purpose.

    str = str.replace("Â", "");

With that, every `Â` is replaced with nothing, giving the result you want.

Kenzo_Gilead
0

Use `String.replace`:

    var = var.replace("Ã", "");

achAmháin
0

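Specify the charset as UTF-8 to get rid of the unwanted extra characters:

    String var = class.getSomething; // placeholder expression from the question
    // getBytes() encodes with the platform default charset; the bytes are then decoded as UTF-8
    var = new String(var.getBytes(), java.nio.charset.StandardCharsets.UTF_8);
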
Mustapha Belmokhtar
0

Do you really want to delete only that one character, or all invalid characters? In the latter case you can check each character with `CharUtils.isAsciiPrintable(char ch)` from Apache Commons Lang. However, according to RFC 3986 even fewer characters are allowed in URLs (alphanumerics and "-_.+=!*'()~,:;/?$@&%", see Characters allowed in a URL).

In any case, you have to create a new String object (for example with `replace`, as in the answer by Elias MP, or by appending the valid characters one by one to a StringBuilder and converting it to a String), because Strings are immutable in Java.
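
A minimal sketch of the StringBuilder approach, assuming "valid" means printable ASCII (codes 32 to 126). The class name `UrlCharFilter` and both helper methods are hypothetical names made up for illustration; the hand-written `isAsciiPrintable` stands in for a library check so the example has no dependencies.

```java
public class UrlCharFilter {
    // Hand-written stand-in for a printable-ASCII check: codes 32 (space) through 126 (~).
    static boolean isAsciiPrintable(char ch) {
        return ch >= 32 && ch < 127;
    }

    // Builds a new String that contains only the printable ASCII characters of the input.
    static String keepAsciiPrintable(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (isAsciiPrintable(ch)) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Non-ASCII characters are written as escapes so the file does not depend on
        // the source encoding: U+00C2 (A with circumflex), U+00A7 (section sign),
        // U+00B0 (degree sign), U+00A3 (pound sign).
        String garbled = "http://www.google.com\u00C2\u00A7\u00C2\u00B0#[]|\u00C2\u00A3%/^<>";
        System.out.println(keepAsciiPrintable(garbled));
    }
}
```

Note that such a filter also drops the §, ° and £ that were part of the original URL, and it still keeps characters such as `<`, `>` and `^` that RFC 3986 does not allow unencoded, so a stricter whitelist may be needed.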

Lars
0

The string in var is output using utf-8, which results in the byte sequence:

c2 a7 c2 b0 c2 a7 23 5b 5d 7c c2 a3 25 2f 5e 3c 3e

This happens to be the ISO-8859-1 encoding of the characters as you see them:

     § ° §#[]| £%/^<>
    Ã§Â°Â§#[]|Â£%/^<>

C2 is the encoding for Â.

I'm not sure how the Ã was produced; its encoding is C3.

We need the full code to learn how this happened, and a description of how the character encoding for text files on your system is configured.

Modifying the variable var is useless.
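
The effect described above can be reproduced with a short sketch. It assumes the string is written as UTF-8 and then read back as ISO-8859-1; the class name `MojibakeDemo` and its helper methods are hypothetical. It reproduces the `Â` characters but not the `Ã`, whose origin remains unexplained.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Formats bytes as space-separated lowercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encodes the text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    static String misread(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Tail of the URL from the question, with non-ASCII characters as escapes:
        // U+00A7 (section sign), U+00B0 (degree sign), U+00A3 (pound sign).
        String tail = "\u00A7\u00B0\u00A7#[]|\u00A3%/^<>";
        System.out.println(toHex(tail.getBytes(StandardCharsets.UTF_8)));

        String seen = misread(tail);
        // Each two-byte UTF-8 sequence turns into two characters, the first being U+00C2.
        System.out.println(seen.length() + " chars instead of " + tail.length());
        System.out.println(seen.equals("\u00C2\u00A7\u00C2\u00B0\u00C2\u00A7#[]|\u00C2\u00A3%/^<>"));
    }
}
```

Under these assumptions the in-memory string never contains `Â`; the character only appears when the UTF-8 bytes are interpreted as ISO-8859-1, which is why editing `var` does not help.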

laune