I tried searching for this question, but I don't think I know the right jargon. I am entering my site content into a MySQL database using PHP, but all of my accented letters and apostrophes (the text is Spanish) get transformed into some garbled encoding. For example:

' becomes â€

á becomes á

And so on. First off, I don't know what this is or what it means, but when I display the text on my site the characters definitely do not revert, nor would I expect them to, because if I manually enter UTF-8 letters they display correctly on my site.
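This pattern is typically what appears when UTF-8 bytes are read as if they were Windows-1252 (the encoding MySQL calls latin1) and then encoded to UTF-8 a second time. The sketch below reproduces that in PHP, assuming the mbstring extension is available and that the apostrophe in question is the typographic one (U+2019). `garble` is a hypothetical helper name used only for illustration.

```php
<?php
// Reproduce the garbling: treat UTF-8 bytes as Windows-1252 and
// encode them to UTF-8 again (a "double encoding").
// garble() is a hypothetical helper for illustration only.
function garble(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');
}

$aAcute     = "\xC3\xA1";     // "á" as UTF-8 bytes
$apostrophe = "\xE2\x80\x99"; // "’" (U+2019) as UTF-8 bytes

echo garble($aAcute), "\n";     // prints Ã¡
echo garble($apostrophe), "\n"; // prints â€™
```

Each byte of the original character turns into a character of its own, which is why one accented letter becomes two or three odd-looking ones.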

Is there a way to fix this without re-entering all of my text? I have a feeling I can extract the text using PHP, decode it, and then insert it back in, but I do not know the functions that do this. The best solution, and if anyone knows how that would be amazing, would be to do it entirely within SQL. By the way, the columns show the collation utf8_general_ci.

For some further information, I am not doing anything to the text that gets entered into the database (I know that is bad, but I suck at this stuff!). I am also not doing anything when it is queried. My functions insert plain text and extract plain text for each page. This way I can write HTML into my forms and it appears as HTML on the page, so the browser interprets it correctly. I also have a feeling this is really sloppy, but like I said... Thanks for all the help!

-- Edit --

Thanks to the people who pointed me to the other questions. However, the way I fixed it was not in that answer; it just gave me the right keywords to start a new search. For anyone who has this problem, I fixed it using the function utf8_decode(). I'm not sure this is a great fix, but at least it is working for now, and speed was my biggest priority. I am certain the core problem is in how I am entering the data into the database.

idelest

1 Answer

You can make sure the character encoding for INSERT queries is correct at runtime by calling `$mysqli->set_charset("utf8")` on the connection before running them.
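As a minimal sketch of where that call would go, assuming the mysqli extension is in use: the helper name `open_utf8_connection` and all connection details are hypothetical placeholders, not taken from the question.

```php
<?php
// Hypothetical connection helper; host, user, password and database
// names are placeholders, not values from the question.
function open_utf8_connection(
    string $host,
    string $user,
    string $password,
    string $database,
    string $charset = 'utf8'
): mysqli {
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $mysqli = new mysqli($host, $user, $password, $database);

    // Tell the server which encoding the client sends and expects.
    // This must happen before any INSERT or SELECT on the connection.
    $mysqli->set_charset($charset);

    return $mysqli;
}
```

It would be called once per request, for example `$db = open_utf8_connection('localhost', 'user', 'secret', 'mysite');`. The charset is a parameter because MySQL's `utf8` stores at most three bytes per character; `utf8mb4` is the variant that covers all of Unicode, if the tables use it.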

It's also prudent to make sure that PHP is sending the correct HTTP response headers and that the browser is told to read the page as UTF-8:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

ini_set('default_charset', 'utf-8');
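A short sketch of doing this explicitly at the top of a page script, on the assumption that nothing has been output yet. The explicit `header()` call is an addition here: a charset declared in the HTTP header takes precedence over the one in the meta tag.

```php
<?php
// Declare UTF-8 both as PHP's default charset and in an explicit
// Content-Type header, before any output is sent.
ini_set('default_charset', 'UTF-8');

// header() only works while no output has been sent yet.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
```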

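To alter your tables, run ALTER TABLE myTable CHARACTER SET utf8 COLLATE utf8_general_ci; Note that this form only changes the table's default for new columns. ALTER TABLE myTable CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; also converts the existing text columns.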
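Changing the table's character set does not repair text that is already stored double-encoded. A commonly used approach for fixing it inside MySQL is to convert the column to latin1, strip the charset label, and reinterpret the bytes as UTF-8. The sketch below only builds that statement; `build_repair_sql`, `pages` and `body` are hypothetical names, and it assumes every row in the column is garbled the same way. Rows that were stored correctly may not survive this conversion, so it is worth trying on a copy of the table first.

```php
<?php
// Build an UPDATE that reverses one round of double encoding inside
// MySQL. Table and column names are placeholders supplied by the caller.
function build_repair_sql(string $table, string $column): string
{
    // Quote identifiers with backticks, doubling any embedded backtick.
    $t = '`' . str_replace('`', '``', $table) . '`';
    $c = '`' . str_replace('`', '``', $column) . '`';

    // 1. CONVERT(... USING latin1): map garbled characters back to single bytes
    // 2. CAST(... AS BINARY): drop the charset label, keep the bytes
    // 3. CONVERT(... USING utf8): reinterpret those bytes as UTF-8
    return "UPDATE $t SET $c = "
         . "CONVERT(CAST(CONVERT($c USING latin1) AS BINARY) USING utf8)";
}

echo build_repair_sql('pages', 'body'), "\n";
```

The resulting string can be passed to `$mysqli->query()` once it has been checked against a backup.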

Alan Kael Ball
  • I have the meta information set, but I have not done ini_set. How could this work if the characters are stored incorrectly in the database, though? Won't PHP just show them as they are? I think my problem is the insert; like you said, I need to set the charset to UTF-8. But this will take a long, long time to fix, seeing as I would have to correct the accents and then reinsert all of my pages – idelest Jan 28 '14 at 14:42
  • What's the output like? Could you use the Multibyte String Functions to translate, e.g. `mb_convert_encoding()`? – Alan Kael Ball Jan 29 '14 at 10:47
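One way the `mb_convert_encoding()` suggestion could look in practice, as a sketch that assumes the text was double-encoded exactly once through Windows-1252. `repair_double_encoding` is a hypothetical helper name. It does the same job as utf8_decode(), except that utf8_decode() targets ISO-8859-1 and so cannot recover characters such as € and ™ (as in â€™), and it is deprecated as of PHP 8.2.

```php
<?php
// Undo one round of double encoding, leaving clean text untouched.
// repair_double_encoding() is a hypothetical helper name.
function repair_double_encoding(string $text): string
{
    // Map each character back to the single Windows-1252 byte it came from.
    $candidate = mb_convert_encoding($text, 'Windows-1252', 'UTF-8');

    // Accept the result only if no character was lost in that step
    // and the recovered bytes form valid UTF-8.
    $lossless = mb_convert_encoding($candidate, 'UTF-8', 'Windows-1252') === $text;
    if ($lossless && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }

    // Otherwise assume the text was never double-encoded.
    return $text;
}

echo repair_double_encoding("\xC3\x83\xC2\xA1"), "\n"; // garbled "Ã¡" in, "á" out
```

The two checks are there so that correctly stored rows pass through unchanged, which matters if only part of the content was entered through the faulty path. A string that legitimately contains a sequence like "Ã¡" would still be altered, so reviewing a sample of the results before writing them back is advisable.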