I have a few text files which are input for a MySQL database. These text files contain characters like é and ë. I struggled to get the data into the database properly, and it now seems I've finally got it right. However, I would like to know whether there is a better way to do this than the one I describe here.

  1. The text files are all UTF-8 encoded.
  2. The PHP scripts are all UTF-8 encoded as well. I've read that this is very important.
  3. All HTML output declares its encoding with a meta tag like this: `<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />`
  4. The MySQL database is created with a collation of latin1_swedish_ci (the character set is left blank).
  5. All the columns that contain characters (VARCHAR) are defined with a collation of latin1_swedish_ci.

I assume URL-encoded strings are stored correctly when the character é appears as %C3%A9 in the database. I found a MySQL function for urlencoding here. But when I open phpMyAdmin, the character é is shown as %C3%A3%C2%A9.

I can add another statement to replace characters in the database, but something tells me there is a more efficient way to achieve this.

Any help is greatly appreciated. Thanks in advance.

Raoul
  • You should use `urlencode` and `urldecode` for generating links, not for data storage. Update your database to use UTF8 as well. – chris85 Nov 05 '16 at 15:52
  • Possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – chris85 Nov 05 '16 at 15:52
  • Could you please add the source code where you read the source text file and convert (urlencode) it, along with the SQL of the insertion? I'd say to use `ascii_bin` as the collation in MySQL, but I may be wrong – Proger_Cbsk Nov 05 '16 at 17:06
  • The question indicated by chris85 was really helpful. I got it all working now! Many thanks. – Raoul Nov 11 '16 at 14:14

1 Answer

What is missing from your list of 5 things is a sixth:

  6. I tell MySQL that the client bytes are utf8-encoded. I do this via `$mysqli_obj->set_charset('utf8');` or `new PDO('mysql:host=host;dbname=db;charset=utf8', $user, $pwd);` or `SET NAMES utf8` (or utf8mb4 in place of utf8).
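As a minimal sketch of that connection setting with PDO: the helper names `mysqlDsn` and `connectUtf8` are hypothetical, as are any host, database, and credential values passed to them. It assumes the `pdo_mysql` driver and PHP 5.3.6 or later, where the `charset` DSN parameter is honored.

```php
<?php
// Build a PDO DSN for MySQL that declares the client encoding.
// mysqlDsn() is an illustrative helper, not part of PDO itself.
function mysqlDsn(string $host, string $db, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $db, $charset);
}

// Open a connection whose client bytes are declared as UTF-8.
// Defined but not called here, since it needs a running MySQL server.
function connectUtf8(string $host, string $db, string $user, string $pwd): PDO
{
    return new PDO(mysqlDsn($host, $db), $user, $pwd, [
        // Throw exceptions instead of failing silently.
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}
```

Putting the charset in the DSN declares it at connect time, so there is no window in which a query runs under the server's default client encoding.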

The client sees utf8, the table sees latin1; the conversion will occur when INSERTing and SELECTing, but it needs #6 to know to do so.
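To illustrate what can go wrong without that setting, here is a small sketch that reproduces double encoding outside the database: the UTF-8 bytes of é are misread as latin1 and then converted to UTF-8 again. The function name `doubleEncode` is made up for this example, and it uses ISO-8859-1 as a stand-in for MySQL's latin1 (which is really cp1252, but the two agree for these bytes). It needs the `mbstring` extension. The four-byte result is the typical double-encoding pattern, similar to what the question reports seeing in phpMyAdmin.

```php
<?php
// Simulate double encoding: treat UTF-8 bytes as if they were latin1
// characters and convert them to UTF-8 a second time.
function doubleEncode(string $utf8Bytes): string
{
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'ISO-8859-1');
}

$original = "\xC3\xA9"; // é as correct UTF-8 (two bytes)

echo rawurlencode($original), "\n";               // %C3%A9
echo rawurlencode(doubleEncode($original)), "\n"; // %C3%83%C2%A9
```

If stored values look like the second line, the bytes were most likely written or read with a mismatched connection charset rather than damaged by the table definition itself.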

Rick James