I have a MySQL database with the default and server encodings all set to utf8. I receive CSV files in multiple encodings, which I have to load into the database using JDBC. But when an incoming file is ANSI-encoded, LOAD DATA INFILE fails with:
java.sql.SQLException: Invalid utf8 character string: '1080'
I create a table, table_abc, based on the CSV headers and then use the query below to load the CSV file into the database:
LOAD DATA LOCAL INFILE 'XXX.csv' INTO TABLE table_abc CHARACTER SET UTF8 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES
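The CHARACTER SET clause in this statement tells the server how to interpret the bytes of the file, so one possible direction is to vary it per file instead of hard-coding UTF8. Below is a minimal sketch of that idea, not a tested solution. The class `LoadDataSql` and its `build` method are hypothetical names, the table name is assumed to be trusted, and the sketch assumes an "ANSI" file is windows-1252, which is what MySQL's `latin1` character set corresponds to.

```java
// Hypothetical helper; class and method names are illustrative only.
final class LoadDataSql {

    // Builds the LOAD DATA statement from the question with a caller-chosen
    // MySQL character set name (for example "utf8" or "latin1").
    // Assumption: the table name comes from trusted code, not from user input.
    static String build(String csvPath, String table, String mysqlCharset) {
        if (!mysqlCharset.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected charset name: " + mysqlCharset);
        }
        // Escape backslashes and single quotes so the path stays a valid SQL string literal.
        String path = csvPath.replace("\\", "\\\\").replace("'", "\\'");
        return "LOAD DATA LOCAL INFILE '" + path + "'"
                + " INTO TABLE " + table
                + " CHARACTER SET " + mysqlCharset
                + " FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'"
                + " LINES TERMINATED BY '\\n' IGNORE 1 LINES";
    }

    public static void main(String[] args) {
        // Prints the statement that would be passed to Statement.execute(...) for an ANSI file.
        System.out.println(build("XXX.csv", "table_abc", "latin1"));
    }
}
```

The resulting string would be passed to the JDBC `Statement` in place of the fixed query; choosing the charset name for each file is a separate step.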
Here are my database character set settings:
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
character_sets_dir C:\Program Files\MySQL\MySQL Server 5.7\share\charsets\
What should I do now?
- Should I convert all files to UTF-8 before uploading? If yes, how do I do that in Java?
- Should I have tables with different encodings for differently encoded files? If yes, how do I detect the encoding of an incoming file in Java?
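To make the first option concrete, here is a rough sketch of converting a file to UTF-8 before loading it. It checks whether the bytes already decode as strict UTF-8 and, if not, decodes them with a fallback charset and re-encodes. The class `Utf8Normalizer` and its methods are hypothetical names, and the choice of windows-1252 as the fallback is an assumption about what "ANSI" means for these files. This is not general encoding detection, and it reads the whole file into memory.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; class and method names are illustrative only.
final class Utf8Normalizer {

    // Assumption: files that are not valid UTF-8 are Windows "ANSI" (windows-1252).
    static final Charset FALLBACK = Charset.forName("windows-1252");

    // Returns true when the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the input unchanged if it is already UTF-8; otherwise decodes it
    // with the fallback charset and re-encodes the text as UTF-8.
    static byte[] toUtf8(byte[] bytes) {
        if (isValidUtf8(bytes)) {
            return bytes;
        }
        return new String(bytes, FALLBACK).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Utf8Normalizer <input.csv> <output.csv>");
            return;
        }
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        // Write a UTF-8 copy that can then be loaded with CHARACTER SET UTF8.
        Files.write(out, toUtf8(Files.readAllBytes(in)));
    }
}
```

With this approach a single utf8 table would be enough, since every file is normalized before LOAD DATA runs. A file in some other single-byte encoding would still load without error, but its non-ASCII characters could come out wrong.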
P.S. I don't mind losing non-UTF-8 characters while loading into the table; my only goal is a successful upload of the file to the DB without any error, irrespective of encoding.
Thanks