14

How can I stop mysql from converting ' into ’ when I do an insert?

i believe it has something to do with charset or something?

I am using php to do the mysql_insert.

Hailwood
  • 89,623
  • 107
  • 270
  • 423
  • 1
    This has to do with charsets. How do you insert/retrieve data - i.e. which library and programming language (or probably ORM) are you using? – AndiDog Oct 04 '10 at 10:48
  • What charset is your database ? – RobertPitt Oct 04 '10 at 10:48
  • 1
    Yes, it has to do with the charset. Check charset of the table and in your connection string. Use correct encoding or unicode/utf. – codymanix Oct 04 '10 at 10:49
  • 1
    I should write a bot that finds "`x` turns `a` into `b`" type questions and gives a generic "you're using UTF-8 but your stuff is interpreting it as Latin-1" answer and makes billions of rep... – David Johnstone Oct 04 '10 at 11:19

3 Answers3

18

The single quotation mark you posted is called an 'acute accent', which is often converted from the generic single quotation mark by some web applications. It's a UTF8 character, which when inserted into a Latin-1 database translates to '’'. This means that you need to change MySQL's charset to UTF8, or alternatively change your website's charset to Latin-1. The former would be preferred:

ALTER DATABASE YourDatabase CHARACTER SET utf8;
ALTER TABLE YourTableOne CONVERT TO CHARACTER SET utf8;
ALTER TABLE YourTableTwo CONVERT TO CHARACTER SET utf8;
...
ALTER TABLE YourTableN CONVERT TO CHARACTER SET utf8;
Ashley Williams
  • 6,770
  • 4
  • 35
  • 42
  • 1
    I did this and now the table is utf8_general_ci but I am still getting the same issue? – Hailwood Oct 04 '10 at 12:05
  • 1
    @Hailwood: You should check the connection encoding as well. – AndiDog Oct 04 '10 at 12:12
  • I tried doing ` mysql_set_charset( 'utf8_general_ci' ,$this->link ); But it makes no difference – Hailwood Oct 04 '10 at 12:19
  • @Hailwood: Then you should post some code, or even better, a small snippet to reproduce the problem. – AndiDog Oct 04 '10 at 12:29
  • is there a way to query for this specific `'†` or this `â. `character? when I do a select with like I am getting all matches to `a` . – awm Mar 03 '22 at 14:21
  • @Hailwood @ awm, in case you haven't been able to fix this yet. Ashley's solution doesn't seem to be converting existing data to UTF8. You will need to run the below Query for all the columns of all the tables which converts latin1 to binary and then Utf8 (or something like that). This worked for me. `UPDATE table_name SET column_name = CONVERT(CAST(CONVERT(column_name USING latin1) AS BINARY) USING utf8);` – Mayur Chauhan Jul 19 '23 at 02:38
1

This is what I've done, and it worked for me:

  1. First make sure that column containing ' is utf8_general_ci

  2. Then add the mysql_set_charset to your code

    $db=mysql_connect("localhost", $your_username, $your_password);
    mysql_set_charset('utf8',$db);
    mysql_select_db($your_db_name, $db);
    
Ehsan
  • 1,022
  • 11
  • 20
1

Maybe someone will know the answer immediately, but I don't. However here are a few suggestions on what to examine (and possibly expand the question on)

When dealing with encodings and escaping you should include the full history of data

  • how was it created
  • what happened to it before the problem (did it have to go through backup, e-mail, was it created on a different server, OS, etc..; if it was transferred then was it as text file?)

The above is because anything that writes to a text file (browser, mysql client, web server, php application, to name a few layers that could have done it) can mess up character coding.

To troubleshoot, you can start eliminating, and thus the first step (in my book), is to

  1. connect to mysql server using mysql command line client.
  2. check the output of SHOW VARIABLES LIKE 'character_set%'
    (so even in this simple environment you have 7 values that can influence how the data is parsed, stored and/or displayed
  3. inspect SHOW CREATE TABLE TableName, and look for charset and collation info, both default for the table and explicit definition on columns

Having said all of the above, I don't think any western script would transcode a single quote character. So you might need to look at your escaping and other data processing.

EDIT Most of the above from answer and discussion here

Community
  • 1
  • 1
Unreason
  • 12,556
  • 2
  • 34
  • 50