2

I'm trying to import a CSV file into a mysql table. The entire database has a utf8_unicode_ci encoding. The CSV file also has an UTF8 encoding. But when I try to run the following it just shows blank characters in the table instead of the special character:

$query = "LOAD DATA LOCAL INFILE 'mycsvfile.csv' INTO TABLE location_temp 
          CHARACTER SET UTF8
          FIELDS TERMINATED BY ';'
          ENCLOSED BY '\"'
          LINES TERMINATED BY '\r\n'";
$parameters = array ();
$connection->QueryExec ( $query, $parameters ); //Connection class

In the table above I only have one column location, the content of the csv file is: "ä";

I've also tried to change the value of the csv to "testthät"; in the table then comes: testth it just skips after the special character.

EDIT: This is how I create my database connection:

try {
    $this->conn = new PDO ( "mysql:host=" . $host . ";charset=utf8;dbname=" . $database, $user, $pass );
    $this->conn->exec ( "set names utf8" );
} catch ( PDOException $e ) {
    echo "Something went wrong while connecting to the database: " . $e->getMessage ();
}

Thanks in advance

Daan
  • 12,099
  • 6
  • 34
  • 51
  • I'd start troubleshooting by firing up the mysql command line and writing an insert that includes that character, then try to select it and make sure you get it back. If that works, I'd start looking at PHP encoding settings. http://stackoverflow.com/questions/279170/utf-8-all-the-way-through – Matt Mar 11 '15 at 21:22
  • @mkaatman When I do an insert query into the database and an select it works just fine. I've forgot to mention that. Thanks for the link. – Daan Mar 11 '15 at 21:25
  • In addition: Is you collation of the mysql table correct and your file correct encoded? (like in: please double check) – frlan Mar 11 '15 at 21:26
  • @frlan Yes file is encoded and the table too. – Daan Mar 11 '15 at 21:26
  • 2
    “The CSV file also has an UTF8 encoding”—are you sure? It doesn't look like it does. CSV files exported from Excel aren't generally UTF-8. Also, are you sure your field terminator is `;`? That doesn't sound like CSV. – bobince Mar 11 '15 at 23:18
  • Truncated utf8 string usually indicates that it was _not_ utf8-encoded. – Rick James Mar 12 '15 at 02:21
  • @bobince The CSV file is indeed CP1252 first but I encoded it to UTF8. I'm sure the field terminator is `;`. – Daan Mar 12 '15 at 07:38
  • Dunno if it's anyhow related, but if you write into the file using [fwrite()](http://php.net/manual/en/function.fwrite.php) function, you NEED to wrap it in `"` NOT `'`. – Eda190 Mar 14 '15 at 15:53
  • 1
    OK, not utf8 -- Please Edit the question to show that. And show some sample data. – Rick James Mar 15 '15 at 18:39

1 Answers1

1

The truncation indicates that it is not utf8. You say it is cp1252? Check SHOW COLLATION LIKE 'CP125%'; If you don't see cp1252, then latin1 is probably what you need.

Change the LOAD ... to specify cp1252 (or latin1) and in the dsn for the connection via PDO. (That is better than using SET NAMES.)

The principles:

  • The client (PHP) is getting latin1 encoding for the data.
  • The client must inform the MySQL server what encoding the data is in. (dsn)
  • The chatter between the client and server will translate the bytes if needed. This could happen if your column(s) say utf8 while the client is latin1. This inconsistency is ok.

More discussion

Rick James
  • 135,179
  • 13
  • 127
  • 222