This question concerns a Tomcat 7 web application, which is connected to a MySQL (5.5.16) database.
When I open a zip file that has filenames encoded in the windows-1252 charset, the characters seem to be interpreted correctly by Java:
    ZipFile zf = new ZipFile( zipFile, Charset.forName( "windows-1252" ) );
    Enumeration<? extends ZipEntry> entries = zf.entries();
    while( entries.hasMoreElements() ) {
        ZipEntry ze = entries.nextElement();
        if( ! ze.isDirectory() ) {
            String name = ze.getName();
            System.out.println( name ); //prints correct filenames, e.g. café.pdf
        }
    }
Omitting the Charset object in the ZipFile constructor causes an exception. The filenames in the zip file are printed correctly to standard output, including diacritics. However, when I subsequently try to store a filename in the database, the é is replaced with a question mark (as seen in the mysql console client). I have had no problems inserting special characters from the web application into MySQL before.
When I execute an INSERT with é hard-coded in the Java source code:
    statement.executeUpdate( "insert into files (filename) values ('café.pdf')" );
the é shows up correctly in MySQL.
Also, my log file shows a comma-like character instead of é: caf‚.pdf
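To rule out the console and the log file as the source of confusion, something like the following sketch could be used to print the actual code points of a name instead of its rendered characters. The class name CharsetProbe and its helper methods are made up for illustration, and the byte value 0x82 is only an assumption about what the zip entry really contains: in windows-1252 that byte is U+201A (a low quotation mark that looks like a comma), while in the DOS code pages 437 and 850 it is é.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic helper, not part of the application.
class CharsetProbe {

    // Renders each code point of s as U+XXXX, so the result does not
    // depend on the encoding of the console or the log file.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Decodes a single byte using the named charset.
    static String decode(int b, String charsetName) {
        return new String(new byte[] { (byte) b }, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // In the real loop this would be toCodePoints(ze.getName()).
        System.out.println(toCodePoints("caf\u00E9.pdf"));

        // Assumption: the entry name holds byte 0x82 where é is expected.
        String[] candidates = { "windows-1252", "IBM437", "IBM850" };
        for (String name : candidates) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + toCodePoints(decode(0x82, name)));
            } else {
                System.out.println(name + ": not available in this JRE");
            }
        }
    }
}
```

If the dump of ze.getName() shows U+201A rather than U+00E9, the name was already decoded with the wrong charset before it ever reached MySQL or the log.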
Does anyone know what could be happening here?