Unknown character � after importing Excel data into MySQL: how to avoid it?

Question

Possible Duplicate:
Problem in utf-8 encoding PHP + MySQL

I've imported about 1000 records into MySQL from an Excel file, but now I'm seeing � in some of the text. The affected characters seem to have been double quotes.

How can I avoid this while importing data?
Can I use the str_replace() function to handle this when printing the data on a web page?

Have you checked the encoding of the page, the database, the database connection etc? Either way it's a dupe. — PeeHaa, Aug 15 '12 at 18:41
possible duplicate of [Problem in utf-8 encoding PHP + MySQL](http://stackoverflow.com/questions/1707792/problem-in-utf-8-encoding-php-mysql) or http://stackoverflow.com/questions/5287821/mysql-db-question-marks-instead-of-hebrew-characters or perhaps http://stackoverflow.com/questions/5445137/utf-8-encoded-html-pages-show-questions-marks-instead-of-characters — PeeHaa, Aug 15 '12 at 18:44
The database collation is `utf8_general_ci`, but I imported the data using the Navicat IDE, so the collation was the only thing I could set. — Mohammad Saberi, Aug 15 '12 at 18:51
How did you import that data? Show the code or the command line. — hakre, Aug 15 '12 at 18:53
@hakre I used http://navicat.com/en/products/navicat_mysql/mysql_overview.html for it. I don't know how it works internally. — Mohammad Saberi, Aug 15 '12 at 19:19
I had this problem with some characters; using `utf8_encode` and `utf8_decode` would convert some of them to "database friendly characters" — William Isted, Aug 15 '12 at 19:24
@MohammadSaberi: If you used *Navicat for MySQL* then the first thing you should do is contact the vendor about your support options. We cannot offer technical support here for proprietary products. — hakre, Aug 15 '12 at 19:39

score 0 · Answer 1 · answered Aug 15 '12 at 18:48

Use preg_replace to do a regex replacement of all unrecognized characters.

Example:

$data = preg_replace("/[^a-zA-Z0-9]/", "", $data);

This example removes every non-alphanumeric character (anything that is not a-z, A-Z, or 0-9), including spaces and punctuation.

http://php.net/manual/en/function.preg-replace.php
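To make the trade-off concrete, here is a minimal sketch comparing the pattern above with a narrower one that removes only the replacement character U+FFFD (the � symbol). The sample string is hypothetical, the `\u{...}` escapes assume PHP 7 or later, and the narrower pattern assumes the stored text is valid UTF-8 that really contains U+FFFD. If the stored bytes are in another encoding and the browser merely renders them as �, the `/u` pattern will not match, and `preg_replace` returns `null` for subjects that are not valid UTF-8.

```php
<?php
// Hypothetical sample: U+FFFD sits where the double quotes used to be.
$data = "He said \u{FFFD}hello\u{FFFD} to 3 people.";

// Pattern from the answer: drops everything except a-z, A-Z, 0-9,
// so spaces and punctuation disappear as well.
$stripped = preg_replace("/[^a-zA-Z0-9]/", "", $data);

// Narrower alternative: remove only U+FFFD, matching in UTF-8 mode (/u).
$cleaned = preg_replace('/\x{FFFD}/u', '', $data);

echo $stripped, "\n"; // Hesaidhelloto3people
echo $cleaned, "\n";  // He said hello to 3 people.
```

Either way the original quotes are not recovered; the characters are only hidden.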


trevorkavanaugh


Whatever the unknown characters were, yes. If it was double quotes before, he would lose them. This is simply in response to his question about using "str_replace". This would be a better alternative. – trevorkavanaugh Aug 15 '12 at 18:49

Brett Thomas · Answer 2 · 2012-08-15T19:01:00.003


$original_string = str_replace(array('“', '”'), '"', $original_string);

Word substitutes a few other characters in the same way, so you will probably also want: $original_string = str_replace(array('‘', '’'), "'", $original_string);

If you see other characters causing the same issue, you can open the document in Word, copy and paste the offending character into your editor, and do a similar replacement.

Since you most likely want to replace each character with an equivalent one, you probably do not want a regex like the one suggested in another answer. str_replace is also faster than preg_replace for this type of use.
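As a sketch of how several such replacements could be grouped, the snippet below keeps them in one lookup table and applies it with `strtr`. The function name `normalize_quotes` and the contents of `$map` are illustrative assumptions, not an exhaustive list of what Word produces, and the approach assumes the text is valid UTF-8 (PHP 7 or later for the `\u{...}` escapes).

```php
<?php
// Illustrative map of typographic characters to plain ASCII equivalents.
// Extend it as other offending characters turn up.
$map = [
    "\u{201C}" => '"',   // left double quotation mark
    "\u{201D}" => '"',   // right double quotation mark
    "\u{2018}" => "'",   // left single quotation mark
    "\u{2019}" => "'",   // right single quotation mark
    "\u{2013}" => '-',   // en dash
    "\u{2026}" => '...', // horizontal ellipsis
];

// Hypothetical helper: apply every replacement in a single pass.
function normalize_quotes(string $text, array $map): string
{
    return strtr($text, $map);
}

$original_string = "\u{201C}It\u{2019}s fine\u{201D}";
echo normalize_quotes($original_string, $map), "\n"; // "It's fine"
```

Keeping the pairs in one array means a newly discovered character needs only one new line.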

edited Aug 15 '12 at 19:01

answered Aug 15 '12 at 18:54

Brett Thomas


Murphy's law assures that the *customer* will see some other characters causing this issue, not the *developer*. Normalizing the encodings seems like a better approach. – DCoder Aug 15 '12 at 18:58
Who's to say the developer is not the customer? He may also not be able to change the format of the Excel files being imported. I've run across issues like this in the past, where files/data were being uploaded by others who were unwilling to change their process. Plus he specifically asked about using str_replace. – Brett Thomas Aug 15 '12 at 19:08
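The encoding normalization suggested in the comments could be sketched as follows. This is only an illustration under a loud assumption: that any input which is not valid UTF-8 was exported from Excel as Windows-1252. The thread never confirms the source encoding. The helper name `to_utf8` is hypothetical, and the code needs the mbstring extension.

```php
<?php
// Hypothetical helper: return the text as UTF-8.
// Assumption: anything that is not already valid UTF-8 is Windows-1252.
function to_utf8(string $raw): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
}

// 0x93 and 0x94 are the curly double quotes in Windows-1252.
$raw = "\x93quoted\x94";
echo to_utf8($raw), "\n"; // prints the word wrapped in curly quotes, as UTF-8
```

Converting before the insert only helps if the table, the connection (for example via `mysqli::set_charset`), and the page all use the same character set as well.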

score 0 · Answer 3 · answered Aug 15 '12 at 18:58


If your database is simple enough (no serialised values and not gigabytes in size), you could export it entirely (e.g. using phpMyAdmin), open the dump in a text editor, do a search-and-replace, and import it back.
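The same search-and-replace can be scripted instead of done by hand. The sketch below assumes the dump is small enough to load into memory and still contains the curly double quotes as valid UTF-8. The function name `fix_dump` and the file paths are hypothetical. It deliberately handles only double quotes: replacing curly single quotes with `'` inside a dump would require escaping them, or the string literals in the SQL would break.

```php
<?php
// Hypothetical helper: rewrite a SQL dump, replacing curly double quotes
// with straight ones. Returns the number of replacements made.
function fix_dump(string $inPath, string $outPath): int
{
    $sql = file_get_contents($inPath);
    if ($sql === false) {
        throw new RuntimeException("Cannot read $inPath");
    }

    // $count receives the number of replacements performed.
    $fixed = str_replace(["\u{201C}", "\u{201D}"], '"', $sql, $count);

    if (file_put_contents($outPath, $fixed) === false) {
        throw new RuntimeException("Cannot write $outPath");
    }
    return $count;
}

// Example usage with placeholder file names:
// $n = fix_dump('dump.sql', 'dump.fixed.sql');
```

Writing to a separate output file keeps the original dump available in case the result needs to be checked before re-importing.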


frnhr

