0

I was trying to determine an error in a java program that loads MySQL tables every night.

Error in the log was java.sql.SQLException: Incorrect string value: '\xEF\xBF\xBD\xEF\xBF\xBD...' for column 'manager' at row 1.

Finally determined there was a new name in the data (loading from a flat file) - FRANÇOIS - and it was the cedilla that was giving the error. Program still loaded everything, just left that field blank.

When I ran a SHOW FULL COLUMNS FOR tablename, it was latin1_swedish_ci. I know very little about collation, charsets. What should I change the collation to in order for it to accept this?

kenfire
  • 1,305
  • 12
  • 23
BigRedEO
  • 807
  • 4
  • 13
  • 33

1 Answers1

0

(To long for a comment)

Need to see more details.

Don't use latin1; use utf8.

Connect with ?useUnicode=yes&characterEncoding=UTF-8 in the getConnection() call

Use CHARACTER SET utf8 in the table and/or column definition. Please provide SHOW CREATE TABLE for confirmation.

EFBFBD is the "replacement" character, implying that you had garbage coming it.

Loading a flat file -- Can you get the hex of Ç from the file? If it is C7 it is latin1 and you should specify latin1 on the load. Is it LOAD DATA? Or something else?

If it is C387 then it is utf8; good.

More discussion, debugging, best practice, etc: Trouble with utf8 characters; what I see is not what I stored

Terminology: "Collation" (eg, latin1_swedish_ci) refers to sort order. Your problem is with "Character set" (eg, latin1 or utf8).

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • The SHOW CREATE TABLE statement shows it used the Default Charset, which is latin1. I will have to verify whether I am allowed access to change that table to uft8, or if there was a particular reason the default is latin1. – BigRedEO Sep 01 '17 at 18:13
  • It is ancient history. The latest version of MySQL defaults to utf8mb4. Still, you can explicitly specify `CHARACTER SET ...` on each `VARCHAR`. – Rick James Sep 01 '17 at 18:22
  • How do a change the char set for a specific column? That would be best solution so it doesn't cause problems with any other programs that might be using that same table. – BigRedEO Sep 01 '17 at 18:23
  • Trouble ensues if you are `JOINing` on that column. But, let's back up a bit. Please show all the charset details and connection details. We _may_ be able to solve your problem without changing the table/col charset. – Rick James Sep 01 '17 at 18:27
  • Don't know that I have access to the connection details - I did not create the java program. It is called by a shell script. – BigRedEO Sep 01 '17 at 18:29
  • Here are the charset details for the database - character_set_client | latin1 character_set_connection | latin1 character_set_database | utf8 character_set_filesystem | utf8 character_set_results | latin1 character_set_server | utf8 character_set_system | utf8 character_sets_dir | /usr/share/mysql/charsets/ – BigRedEO Sep 01 '17 at 18:30