I created a CMS in PHP using PDO/MySQL. On each page in the CMS I have the HTML charset entered as UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I have a couple of <textarea> elements where the user can enter some descriptive text for the item. This works well for strings that are typed directly into the <textarea>; however, the client has a tendency to copy and paste from another document, by the looks of it. Every so often I see a � on the page, even though in the database the character is actually a hyphen, an apostrophe, etc.
I know that it's an encoding issue, but what I'm not sure of is where the problem is. If I'm setting the page charset as UTF-8, should the content not save as UTF-8?
If I change the charset to Latin-1 (ISO 8859-1) the symbol appears correctly. I was under the impression that UTF-8 had a broader character range than Latin-1, which kind of adds to the confusion.
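To make the symptom concrete, here is a minimal PHP sketch of what may be happening. It assumes the pasted punctuation reaches the server as single Windows-1252 bytes (for example 0x92 for a curly apostrophe), which is a guess and not something confirmed above. The helper name `toUtf8` is hypothetical, and the code needs the mbstring extension.

```php
<?php
// Hypothetical helper: returns $text unchanged if it is already valid UTF-8,
// otherwise re-encodes it on the ASSUMPTION that the bytes are Windows-1252.
function toUtf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Assumed input: a curly apostrophe stored as the single byte 0x92.
$pasted = "client\x92s item";

// A lone 0x92 byte is not a valid UTF-8 sequence, which would explain
// the replacement character on a page declared as UTF-8.
var_dump(mb_check_encoding($pasted, 'UTF-8'));

// After conversion the apostrophe is U+2019, encoded as three UTF-8 bytes.
var_dump(toUtf8($pasted) === "client\u{2019}s item");
```

If that assumption holds, it would also explain why the Latin-1 declaration appears to work: the page would then be interpreting those single bytes in a one-byte encoding instead of expecting UTF-8 sequences.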
Is there a workaround for this in PHP? Can I force a field in MySQL to be a specific charset?
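For clarity, this is roughly the kind of thing I mean by forcing the charset, as a sketch under assumptions and not my actual code. The helper names `buildUtf8Dsn` and `buildColumnCharsetSql`, the database name `cms`, the table `items`, and the column `description` are all hypothetical. It relies on the `charset` option of the PDO MySQL DSN and on MySQL's `ALTER TABLE ... MODIFY ... CHARACTER SET` syntax.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that requests utf8mb4
// for the connection itself, not only for the HTML page.
function buildUtf8Dsn(string $host, string $dbName): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

// Hypothetical helper: builds SQL that pins a single TEXT column to utf8mb4.
// The identifiers are interpolated directly, so only pass trusted names.
function buildColumnCharsetSql(string $table, string $column): string
{
    return sprintf(
        'ALTER TABLE `%s` MODIFY `%s` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci',
        $table,
        $column
    );
}

// Intended usage (needs a real server and credentials, so it is left commented out):
// $pdo = new PDO(buildUtf8Dsn('localhost', 'cms'), $user, $password);
// $pdo->exec(buildColumnCharsetSql('items', 'description'));
```

What I can't tell is whether the page charset, the connection charset, and the column charset all need to agree, or whether fixing one of them is enough.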
I have checked a few questions and answers on SO without much luck. Most of the answers are "Make sure you're declaring your charset", which I am.