3

I have a file.csv with 300,000 rows. Many of the rows are the names of foreign cities with accented characters. When I try to do an import in mysql, I get warnings on the fields, and it truncates the field at the point of the special character.

    LOAD DATA LOCAL INFILE '/var/tmp/geoip/location.csv' INTO TABLE Geolocation2 FIELDS TERMINATED BY ',' enclosed by '"' LINES TERMINATED BY '\n' (id, country, region, city, postalCode, latitude, longitude, metrocode, areacode );
Barry Chapman
  • 6,690
  • 3
  • 36
  • 64

5 Answers5

2

I had this problem with CSV files created by MS Excel. If you are using Excel, or need to convert CSV files to UTF-8 for importing into MySQL the answer here may be of help.

Community
  • 1
  • 1
Glenn Lawrence
  • 2,844
  • 1
  • 32
  • 38
2

Open and SaveAs all your SQL Query and Data Files with UTF-8 encoding

This will solve BULK INSERT problems use option WITH (DATAFILETYPE = 'widenative')

It will also solve INSERT INTO Problems whether the data is in the same file as the CREATE TABLE instruction or chained :r "X:\Path\InsertIntoMyTable.sql"

Naemoor
  • 39
  • 5
1

You need to set the connection, database, table and column encodings to the same character set as the data was saved in the CSV file.

http://dev.mysql.com/doc/refman/5.0/en/charset.html

Dan Grossman
  • 51,866
  • 10
  • 112
  • 101
0

This sequence works to me.

  • create database {$databasename} DEFAULT CHARACTER SET latin1;
  • ALTER DATABASE {$databasename} DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;
  • charset latin1;
  • load data infile '{$file.csv}' into table {$tablename} character set latin1 fields terminated by '|' enclosed by '"' lines terminated by '\n';
josdem
  • 196
  • 2
  • 8
-1

Tell MySQL what the codepage of the source file is when you import it. e.g. to import a file with codepage Windows-1252, use MySQL codepage latin1 (which is the same thing) like this:

LOAD DATA LOCAL INFILE '/path/to/file.csv'
  INTO TABLE imported_table
  CHARACTER SET 'latin1'
  COLUMNS TERMINATED BY ','
  OPTIONALLY ENCLOSED BY '"'
  ESCAPED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES;
webaware
  • 2,795
  • 2
  • 30
  • 37
  • Setting the charset to `utf8mb4` on import worked for me. `LOAD DATA LOCAL INFILE 'my.csv' INTO TABLE geo_data CHARACTER SET 'utf8mb4' FIELDS TERMINATED BY ...`. – turrican_34 Sep 15 '20 at 12:27