-1

I'm developing a project for my university. I'm having trouble while trying to convert a byte array to a string.

So, first I'm saving the contents of a java file in a long blob type column in MySql like this:

Path path = Paths.get(file.getAbsolutePath());
Charset charset = StandardCharsets.UTF_8;
String fileContent = new String(Files.readAllBytes(path), charset);
...
String query = "INSERT INTO MY_TABLE(FILE_CONTENT) VALUES (?)";
PreparedStatement preparedStmt = conn.prepareStatement(query);
preparedStmt.setString(1, fileContent);
preparedStmt.executeUpdate();

The file contains some words with Spanish accents, which are inserted in the database correctly (as shown in the image): data saved in MySql

Later, in my application I need to read the information from this column. I did like this:

String query = "SELECT * FROM MY_TABLE";
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery(query);
while (rs.next())
{
  Blob file= rs.getBlob("FILE_CONTENT");
  String fileContent= new String(file.getBytes(1l, (int) file.length()), StandardCharsets.UTF_8);

When I print fileContent the character é is not recognized. I get a text with something like this: driver.findElement(By.id("FNAME")).sendKeys("Leonardo Pa�z")

I have tried what is suggested in these posts (encoding decoding of byte array to string without data loss), (Encode String to UTF-8), (Java Strings Character Encoding - For French - Dutch Locales) and some blogs like this one, but I haven't been able to find a solution to this problem.

Any help will be really appreciated :)

  • Can you provide the HEX of `é`. If that is `E9`, you have latin1; if it is `C3A9`, you have correctly formatted utf8. – Rick James Jan 12 '18 at 18:15

1 Answers1

0

You say FILE_CONTENT is a blob, yet you use setString() to set it's data. From the javadocs:

Note: The setter methods (setShort, setString, and so on) for setting IN parameter values must specify types that are compatible with the defined SQL type of the input parameter. For instance, if the IN parameter has SQL type INTEGER, then the method setInt should be used.

Since your type is a blob, you must use setBlob:

try (InputStream fileContent = Files.newInputStream(path))
{
    preparedStmt.setBlob(1, fileContent);
}

Another way is to change the SQL type to one of the TEXT types, e.g. LONGTEXT (if you expect big files). BLOB doesn't really make much sense for text. Then you can simply use setString() and getString():

rs.getString("FILE_CONTENT");

You'd also want to change the database/table/column encoding to UTF-8.

To explain a bit further:

When you provide the String to your database using setString(), neither Java, nor your database know anything about the encoding of the original file. The database stores the characters using it's default encoding, probably ISO_8859_1. Since ISO_8859_1 has the character é, you can see it fine in your MySQL workbench. Then when you get the data using getBlob() and decode it using UTF-8, the é character gets lost, because the text isn't stored as UTF-8.

Max Vollmer
  • 8,412
  • 9
  • 28
  • 43