0

I have a table that uses the UTF8 charset; its columns use the utf8_general_ci collation. I read the data with a prepared statement, but it is not displayed correctly, and the data stored in the table itself is not readable either. I need to write code that makes the values human-readable. I have tried many approaches, and all of them failed. For the connection properties I used "?useUnicode=true&characterEncoding=UTF8", and I also tried re-decoding the value:

String city=resultset.getString("city");
byte[] data = city.getBytes();
String valueCity = new String(data, "UTF-8"); // Or  String valueCity = new String(data, StandardCharsets.UTF_8);

I see something like "&#21517;&#21476;&#23627;&#24066;" in my table, but I need to read and write it as 名古屋市. Any suggestions on how to handle this problem, which has been a real pain in the neck? Thanks a million in advance.

Shila Mosammami

4 Answers

1

The problem may be resultset.getString("city") itself. You already receive the data as a String, and the byte representation of that String is likely not UTF-8. What is the type of resultset?
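To illustrate why the getBytes / new String round trip from the question cannot help: encoding a String and decoding it again with the same charset gives back an equal String. The sketch below assumes the column really contains the literal ASCII text shown in the question; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Encodes the String to UTF-8 bytes and decodes those bytes as UTF-8 again.
    // For any well-formed String this returns an equal String, so nothing is "repaired".
    static String roundTrip(String s) {
        byte[] data = s.getBytes(StandardCharsets.UTF_8);
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed value of resultset.getString("city"), copied from the question
        String city = "&#21517;&#21476;&#23627;&#24066;";
        System.out.println(roundTrip(city).equals(city));
    }
}
```

If the stored value is entity text like this, the charset settings are not the cause; the entities themselves have to be decoded.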

Are you sure you opened your database connection with characterEncoding=utf8? You need to set connectionProperties="useUnicode=yes;characterEncoding=utf8;"
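A minimal sketch of passing these settings as java.util.Properties rather than in the URL. The property names are the ones quoted in this thread, so check them against the documentation of your driver version; the class name, the helper names, and the example URL are assumptions for illustration only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class Utf8Connection {
    // Builds the connection properties. "useUnicode" and "characterEncoding"
    // are the names used in the question, not verified for every driver version.
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    // jdbcUrl is a placeholder such as "jdbc:mysql://localhost:3306/mydb" (hypothetical).
    static Connection open(String jdbcUrl, String user, String password) throws SQLException {
        return DriverManager.getConnection(jdbcUrl, utf8Properties(user, password));
    }
}
```

These settings only affect how text travels between the application and the server. They do not change values that were already stored as entity text.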

Stackoverflow

Stephan
1

Something other than MySQL is generating "HTML entities" such as &#21517;. Find where they are coming from and undo it.

Since those entities are probably already stored in the table, that needs to be undone, too.

The html entities should render correctly in any browser. Are you trying to use them in some other context?
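One way to confirm this diagnosis is to test whether a value is plain ASCII that contains numeric character references. This is only a sketch: the class and method names are invented here, and the pattern covers decimal and hexadecimal references only, not named entities.

```java
import java.util.regex.Pattern;

class EntityCheck {
    // Matches decimal (&#21517;) and hexadecimal (&#x540D;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[0-9]+|[xX][0-9a-fA-F]+);");

    // True if the text contains at least one numeric character reference.
    static boolean containsNumericReference(String s) {
        return NUMERIC_REF.matcher(s).find();
    }

    // True if every character is in the 7-bit ASCII range.
    static boolean isPureAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }
}
```

A value for which both checks return true was escaped before it reached the database, so no charset setting on the connection will turn it back into the original characters.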

Rick James
  • They were special characters that are not available in the English alphabet, so I could not delete them. I just needed to read them correctly. Anyway, thanks for the response. – Shila Mosammami Nov 18 '21 at 10:17
0

It might help to inspect resultset.getBytes(..) first, instead of getString.
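A small sketch of what to look for in those bytes. The toHex helper and the class name are made up for this example, and the sample values stand in for whatever resultset.getBytes("city") returns.

```java
import java.nio.charset.StandardCharsets;

class HexDump {
    // Formats bytes as space-separated, two-digit uppercase hex values.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Entity text: every byte is plain ASCII
        System.out.println(toHex("&#21517;".getBytes(StandardCharsets.US_ASCII)));
        // The real character U+540D encoded as UTF-8: three bytes
        System.out.println(toHex("\u540D".getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the column bytes start with 26 23 (the characters & and #), the table holds entity text rather than UTF-8 encoded characters.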

user3151610
0

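I finally found code that works:

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String unescapeXML( final String xml )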
{
    Pattern xmlEntityRegex = Pattern.compile( "&(#?)([^;]+);" );
    // Before Java 9, Matcher.appendReplacement accepts only a StringBuffer, not a StringBuilder
    StringBuffer unescapedOutput = new StringBuffer( xml.length() );

    Matcher m = xmlEntityRegex.matcher( xml );
    Map<String,String> builtinEntities = null;
    String entity;
    String hashmark;
    String ent;
    int code;
    while ( m.find() ) {
        ent = m.group(2);
        hashmark = m.group(1);
        if ( (hashmark != null) && (hashmark.length() > 0) ) {
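            // Numeric reference: decimal (&#21517;) or hexadecimal (&#x540D;)
            code = ( ent.startsWith( "x" ) || ent.startsWith( "X" ) )
                 ? Integer.parseInt( ent.substring( 1 ), 16 )
                 : Integer.parseInt( ent );
            // Character.toChars also handles code points above U+FFFF, which a (char) cast would truncate
            entity = new String( Character.toChars( code ) );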
        } else {
            //must be a non-numerical entity
            if ( builtinEntities == null ) {
                builtinEntities = buildBuiltinXMLEntityMap();
            }
            entity = builtinEntities.get( ent );
            if ( entity == null ) {
                //not a known entity - ignore it
                entity = "&" + ent + ';';
            }
        }
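        // quoteReplacement keeps '$' and '\' in the decoded text from being read as group references
        m.appendReplacement( unescapedOutput, Matcher.quoteReplacement( entity ) );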
    }
    m.appendTail( unescapedOutput );

    return unescapedOutput.toString();
}

private static Map<String,String> buildBuiltinXMLEntityMap()
{
    Map<String,String> entities = new HashMap<String,String>(10);
    entities.put( "lt", "<" );
    entities.put( "gt", ">" );
    entities.put( "amp", "&" );
    entities.put( "apos", "'" );
    entities.put( "quot", "\"" );
    return entities;
}
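
For comparison, here is a more compact sketch of the same idea, limited to numeric character references and assuming Java 9 or later, where Matcher.replaceAll accepts a function. The class name NumericEntityDecoder and the method name decode are invented for this example; named entities such as &amp; are left untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericEntityDecoder {
    // Group 1 captures a decimal reference, group 2 a hexadecimal one.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:([0-9]+)|[xX]([0-9a-fA-F]+));");

    // Replaces numeric character references with the characters they denote.
    // Anything that is not a valid code point is left exactly as it was.
    static String decode(String text) {
        return NUMERIC_REF.matcher(text).replaceAll(match -> {
            try {
                int codePoint = match.group(1) != null
                        ? Integer.parseInt(match.group(1))
                        : Integer.parseInt(match.group(2), 16);
                if (Character.isValidCodePoint(codePoint)) {
                    String decoded = new String(Character.toChars(codePoint));
                    // Quote so that '$' and '\' are not treated as group references
                    return Matcher.quoteReplacement(decoded);
                }
            } catch (NumberFormatException e) {
                // The number does not fit in an int: fall through and keep the text
            }
            return Matcher.quoteReplacement(match.group());
        });
    }

    public static void main(String[] args) {
        System.out.println(decode("&#21517;&#21476;&#23627;&#24066;"));
    }
}
```

Decoding on every read works, but it only hides the symptom. Running the same decoding once over the stored rows, and fixing whatever escapes the values on write, removes the need for it.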
Shila Mosammami