2

I have a large String array that I display in a TextView. The strings contain stray character sequences such as ’, âˆ, Ã, −, ‘, etc., which have no meaning in English. How can I decode these sequences back into the characters they were meant to be?

I have already tried UTF-8 encoding, but it does not work:

private final static Charset UTF8_CHARSET = Charset.forName("UTF-8");

public static String getUTF8Encoded(String targetString) {
    String resultant = "";
    try {
        return new String(encodeUTF8(targetString), UTF8_CHARSET);
    } catch (Exception e) {
        e.printStackTrace();
        return resultant;
    }
}

private static final byte[] encodeUTF8(String string) {
    return string.getBytes(UTF8_CHARSET);
}
Garg
  • Do these characters have corresponding Latin alphabets? For instance something similar to Cyrillic alphabets? – ishmaelMakitla Jun 18 '16 at 10:58
  • Possible duplicate of ["’" showing on page instead of " ' "](http://stackoverflow.com/questions/2477452/%c3%a2%e2%82%ac-showing-on-page-instead-of) – Lori Jun 18 '16 at 11:27

2 Answers

0

Instead of:

Charset.forName("UTF-8");

try this:

Charset.forName("windows-1252");
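
One way to read this suggestion, assuming the strings were originally UTF-8 bytes that got decoded as windows-1252 somewhere upstream: encode the garbled text back to bytes with windows-1252, then decode those bytes as UTF-8. Using the same charset for both steps only round-trips the string and cannot repair it. The class and method names below are hypothetical, and the sketch assumes the runtime provides the windows-1252 charset.

import java.nio.charset.Charset;

final class MojibakeRepair {
    private static final Charset UTF8 = Charset.forName("UTF-8");
    // Assumption: this is the charset the text was wrongly decoded with.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Hypothetical helper: undoes a "UTF-8 read as windows-1252" mix-up. */
    static String repair(String garbled) {
        // Re-encode with the wrong charset to recover the original bytes...
        byte[] originalBytes = garbled.getBytes(CP1252);
        // ...then decode those bytes with the charset they were written in.
        return new String(originalBytes, UTF8);
    }

    public static void main(String[] args) {
        // "\u00e2\u20ac\u2122" is the garbled form of U+2019 (right single quote).
        System.out.println(repair("\u00e2\u20ac\u2122"));
    }
}

This only helps for text that really went through that exact mix-up. Applying it to strings that were never garbled can damage them, because correctly decoded accented characters will not form valid UTF-8 after the re-encoding step. Fixing the charset at the point where the data is first read is usually the more reliable option.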
Rolf ツ
Carlos Hernández Gil
-2

You can use Apache Commons Lang:

org.apache.commons.lang3.StringUtils.stripAccents("Añ");

This returns "An".


Another solution:

The function below uses NFD normalization, which decomposes each accented character into its unaccented base character followed by its combining diacritical marks. A regex then strips the diacritics, leaving only the base characters.

import java.text.Normalizer;
import java.util.regex.Pattern;

public String deAccent(String str) {
    // NFD splits each accented character into a base character plus combining marks.
    String nfdNormalizedString = Normalizer.normalize(str, Normalizer.Form.NFD);
    // Match runs of combining diacritical marks and remove them.
    Pattern pattern = Pattern.compile("\\p{InCombiningDiacriticalMarks}+");
    return pattern.matcher(nfdNormalizedString).replaceAll("");
}
Patrick Trentin
benyob