
We are porting our software from PHP to Java, using Spring Boot / JPA to store data. All our data is currently stored in MySQL using the latin1 character set (ISO-8859-1), and this can't be changed yet.

With the PHP software, storing UTF-8 strings in the database tables just worked: characters that cannot be represented in latin1 were simply omitted (or replaced by "?" characters).

But storing a string containing such UTF-8 characters via Spring Boot results in this exception:

2022-07-25 10:54:43.894  WARN 84143 --- [nio-8080-exec-7] o.h.engine.jdbc.spi.SqlExceptionHelper   : SQL Error: 1366, SQLState: 22001
2022-07-25 10:54:43.897 ERROR 84143 --- [nio-8080-exec-7] o.h.engine.jdbc.spi.SqlExceptionHelper   : Data truncation: Incorrect string value: '\xEF\x82\xA7;\x0A\x09...' for column `tanss`.`mails`.`body` at row 1
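
To see which character MySQL is rejecting, the bytes quoted in the error (`\xEF\x82\xA7`) can be decoded as UTF-8 and checked against ISO-8859-1. This is only a diagnostic sketch; the class and method names are made up for illustration, and it assumes Java's `ISO_8859_1` is a close enough stand-in for MySQL's latin1 for this particular character.

```java
import java.nio.charset.StandardCharsets;

class IncorrectStringValueDemo {
    // The first three bytes quoted in the MySQL error message: \xEF\x82\xA7
    static final byte[] REPORTED = {(byte) 0xEF, (byte) 0x82, (byte) 0xA7};

    // Decode the reported bytes as UTF-8 to recover the offending character.
    static String decodeReported() {
        return new String(REPORTED, StandardCharsets.UTF_8);
    }

    // True only if every character of s can be encoded in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String s = decodeReported();
        // Prints: U+F0A7 fits latin1: false
        System.out.printf("U+%04X fits latin1: %b%n", s.codePointAt(0), fitsLatin1(s));
    }
}
```

U+F0A7 lies in the Unicode private use area, so it has no latin1 equivalent and the column cannot store it.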

Is there any way to just "ignore" such characters before persisting, so that only latin1 characters are stored (e.g. via a setting in application.properties or on the MySQL server)?

--- UPDATE ---

I was able to fix it. In case someone else runs into this problem with latin1 tables: the MySQL driver (Connector/J) 8.0.29 no longer derives the character encoding from the default charset of the MySQL server and uses "utf8mb4" by default instead. Adding &characterEncoding=latin1 to the JDBC URL fixed the problem!
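
For illustration, here is a small sketch of how that parameter ends up on the URL. The helper class, host and port are hypothetical (only the database name `tanss` comes from the log above); in a Spring Boot application the resulting string would normally just be written into `spring.datasource.url` by hand.

```java
class JdbcUrlEncoding {
    // Appends characterEncoding=<encoding>, using '?' for the first
    // query parameter and '&' when the URL already has parameters.
    static String withCharacterEncoding(String jdbcUrl, String encoding) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        // Hypothetical host and port; adjust to the real server.
        String base = "jdbc:mysql://localhost:3306/tanss?useSSL=false";
        System.out.println(withCharacterEncoding(base, "latin1"));
    }
}
```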

Fasty
  • At a glance, the mysql connector docs don't show any configs that describe what you want https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-configuration-properties.html. You would have to write code to do this yourself. – Taylor Jul 25 '22 at 13:26
  • You could try auto-registering an `AttributeConverter` with `convertToDatabaseColumn(String)` implemented so that the input gets converted as needed, see https://vladmihalcea.com/jpa-attributeconverter/. Not tested, not sure whether it could work, but worth a shot I guess... – sp00m Jul 25 '22 at 14:30
  • See "truncated" and "question mark" in https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Jul 26 '22 at 05:19
  • Is the goal to delete any non-ascii characters? That might result in an empty string. – Rick James Jul 26 '22 at 05:20
  • I was able to fix it. In case someone else runs into this problem with latin1 tables: the MySQL driver 8.0.29 no longer derives the character encoding from the default charset of the MySQL server and uses "utf8mb4" by default instead. Adding "&characterEncoding=latin1" to the JDBC URL fixed the problem! – Fasty Jul 26 '22 at 13:23
  • @Fasty nice, feel free to answer your own question, it could serve others :) – sp00m Jul 27 '22 at 10:24
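
If the conversion is done in application code, as the first two comments suggest, a minimal sketch could look like the following. `Latin1Sanitizer` and its methods are hypothetical names, and the code assumes Java's `ISO_8859_1` matches what the latin1 columns accept. It is untested against the actual schema; the methods could be called from an `AttributeConverter.convertToDatabaseColumn(String)` implementation or directly before saving the entity.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class Latin1Sanitizer {
    // Replaces every character that ISO-8859-1 cannot represent with '?'.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for unmappable characters, so a round trip is enough.
    static String replaceUnmappable(String input) {
        byte[] latin1 = input.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Drops every character that ISO-8859-1 cannot represent.
    static String dropUnmappable(String input) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (encoder.canEncode(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

As Rick James points out, dropping characters can leave an empty string when the input contains nothing representable, so the replacing variant may be the safer default.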

0 Answers