I can't get my PHP/HTML pages to display characters with accents (é, á) correctly. My database is set up correctly. When I run SHOW VARIABLES LIKE 'char%' I get satisfactory output:

character_set_client        utf8
character_set_connection    utf8
character_set_database      utf8
character_set_filesystem    binary
character_set_results       utf8
character_set_server        utf8
character_set_system        utf8
character_sets_dir          /usr/share/mysql/charsets/

Data is encoded and stored correctly in the DB when I insert it from a form on my site.

But something weird happens on my display page.

If I put header('Content-Type: text/html; charset=utf-8'); at the top, just inside the PHP opening tag, data from my DB is displayed correctly, but characters that I type manually into the page within HTML tags (as table column headers or just descriptive paragraph text) display as rubbish: �.

If I REMOVE header('Content-Type: text/html; charset=utf-8'); from the top of the page, data from my DB is displayed INCORRECTLY but the text that I write into the page within HTML is fine.

I've tried everything I know. Has anyone come across this before?

Benjamin
  • I had that same issue just yesterday actually. – Funk Forty Niner Dec 05 '17 at 15:35
  • Have you checked if your files are saved as `utf-8`? – M. Eriksson Dec 05 '17 at 15:39
  • Does the table you are pulling from have its fields set to the `utf8_unicode_ci` collation as well? We had an issue like this when converting an old ISO table to UTF-8; it was a total mess. – IncredibleHat Dec 05 '17 at 15:42
  • @Magnus you mean the .php files? Should I check in Windows Notepad, using the “Save As…” option in the File menu? – Benjamin Dec 05 '17 at 15:43
  • @Randall, they're utf8_general_ci . Could that be a problem? – Benjamin Dec 05 '17 at 15:44
  • Sure, if that's your preferred code editor. Make sure you save them as UTF-8 and try again. I would recommend using a proper IDE, though. – M. Eriksson Dec 05 '17 at 15:45
  • `utf8_general_ci` should be fine as well, so long as it's a utf8 variant. I just wanted to make sure the utf8 data wasn't being stuffed into an old ISO column, where things 'can go wonky' when pulling data ;) – IncredibleHat Dec 05 '17 at 15:53
  • @Magnus the Save options I get are .rtf, .docx, .odt, .txt (MS-DOS) and .txt (Unicode text document). I always use the last. BUT THEN I rename them to .php so they'll function in my application. Is that a problem? – Benjamin Dec 05 '17 at 15:54
  • That sounds scary similar to WordPad and not Notepad. You can't save rtf and odt from notepad. In notepad, you can save a file as *.txt or as *.* under type and then you have another option "Encoding" that needs to be set to "UTF-8". But seriously, install a real editor like Sublime, Atom or similar. Using WordPad for coding is like using a used kleenex as a screw driver. – M. Eriksson Dec 05 '17 at 15:58
  • @Magnus, I got it! I was looking in the wrong place. BELOW the Save options there are the Encoding options, and I hadn't selected UTF-8!! Thanks so much. It's displaying correctly now! – Benjamin Dec 05 '17 at 16:02
  • THAT was the problem??? Hmm, I never realized that changing the base encoding of a CODE file would somehow affect data pulled from a database and sent (as a variable) to HTML output. I know it makes a HUGE difference for hard-coded characters in the code (like in basic echo statements) ... but a variable output from a DB??? Oh well! Glad it's working now for you ;-) – IncredibleHat Dec 05 '17 at 16:07
  • @Randall If it makes a difference for hard-coded data, why wouldn't it make a difference no matter where the content comes from? It's all combined into one single response that gets sent to the browser in the end (which is where it gets messed up). – M. Eriksson Dec 05 '17 at 16:09
  • It's just odd to wrap my mind around. That suggests to me that the PHP parser itself uses the base encoding of the .php code file to encode data from $variables in echo/print statements before it sends it out the door. Kind of scary. I guess I never came across that, as every file I work with is utf8 (even back when the sites were not using utf8 lol!). – IncredibleHat Dec 05 '17 at 16:13
  • This fix is going to save me a lot of time in the future! I don't get how it all works in PHP, but it seems to be a messy problem for PHP. In another project I got around a related problem with mb_convert_encoding($row[DATA], "latin1", "UTF-8"), but it didn't help today. – Benjamin Dec 05 '17 at 16:20
  • https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Dec 06 '17 at 00:48
  • Collation is irrelevant for _this problem_. – Rick James Dec 06 '17 at 00:52
  • @Hego - Do _not_ use any convert functions in PHP; it will very likely make things worse. – Rick James Dec 06 '17 at 00:54
  • @Rick James Thanks. Nice trouble-shooting list in that linked question. And I'll avoid encoding in future. I think the editor saving as utf-8 is what I've needed all along. I've been using Notepad which saves by default to utf-16, apparently. Anyway, it's poxy. I'll trade up editors. – Benjamin Dec 06 '17 at 15:26
  • Use utf-8, not utf-16. – Rick James Dec 06 '17 at 20:30
  • @ Magnus Eriksson You were right. I was using Wordpad, not Notepad. Sorry! But what about this: I looked at Phpstorm, Atom and Sublime. I got Sublime and put my scripts into it and encoded to UTF-8 and it mangled my í characters. Turned brí into br?/th> !! Not just outputted rubbish, but actually broke my html tags! – Benjamin Dec 07 '17 at 15:44

2 Answers

The only thing I can think of in addition to what you've done on the server side is to add:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

to the <head> section of your HTML file.
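
As a minimal sketch of what this looks like together with the header() call from the question: the helper name render_page() and the sample text are assumptions made up for illustration, not part of the original page. Both declarations only label the bytes being sent; neither one converts anything, so the .php file itself still has to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the answer): wraps body markup in a page
// whose <head> declares UTF-8 via the meta tag shown above.
function render_page(string $bodyHtml): string
{
    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />' . "\n"
        . "</head>\n<body>\n"
        . $bodyHtml . "\n"
        . "</body>\n</html>\n";
}

// Send the same declaration as an HTTP header. This has to happen
// before any output, hence the guard.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// "\xC3\xA9" is é written out as UTF-8 bytes, so this line does not
// depend on how this .php file happens to be saved.
echo render_page("<p>caf\xC3\xA9</p>");
```

If the header and the meta tag ever disagree, browsers generally follow the HTTP header, so it is simplest to keep both set to UTF-8.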

Adam Thomason

Try changing your file encoding to UTF-8. Notepad++, SciTE, PhpStorm and many others can do it.
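
To see why the file encoding matters, here is a rough sketch that compares the bytes an editor would write for "brí" in UTF-8 and in a legacy single-byte encoding. The helper name is_valid_utf8() is an assumption for illustration; it relies on preg_match() with the /u modifier, which fails on malformed UTF-8 input.

```php
<?php
// Hypothetical helper: true when $bytes is well-formed UTF-8.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier, preg_match() returns false for malformed
    // UTF-8 subjects, and 1 for this empty pattern otherwise.
    return preg_match('//u', $bytes) === 1;
}

$utf8   = "br\xC3\xAD"; // "brí" as written by an editor set to UTF-8
$legacy = "br\xED";     // "brí" as written in Windows-1252 / ISO-8859-1

var_dump(is_valid_utf8($utf8));   // bool(true)
var_dump(is_valid_utf8($legacy)); // bool(false)
```

When the page is served as charset=utf-8, the lone 0xED byte in the second string is not a valid sequence, which is the kind of input a browser renders as �. Running the same check on file_get_contents() of a script is one way to confirm how an editor actually saved it.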