1

I have a website with a very simple news system (posting, editing, deleting, etc.). All my HTML pages are saved as UTF-8, and everything displays correctly.

I specify using UTF in every header:

To save news to the database, I use simple scripts like this (all values come from an HTML form):

   $newsTitel   = isset($_POST['title']) ? $_POST['title'] : 'Untitled';
   $submitDate  = $date = date('Y/m/d');
   $content = isset($_POST['newstext']) ? $_POST['newstext'] : 'No content';

   include 'includes/dbconnect.php';

   mysql_query("SET CHARACTER SET utf8");
   mysql_query("SET NAMES 'utf8'"); 
   $query = mysql_query("INSERT INTO news SET date='$submitDate',subject='$newsTitel',news='$content'");

The data gets saved to the database, but in a garbled encoding. Characters like à ¡ Ä appear, which make the content almost unreadable. The other problem is that when I load this content back into HTML forms (for editing news), it is displayed in the same garbled encoding. According to the specification of the database I use, it stores data in UTF-8.

I use phpMyAdmin to access the MySQL database.

So to sum it up:

Pages: saved in UTF-8, all have the correct header
Database: interaction with the server is utf8_czech_ci, tables are in the same format

What I do not understand at all is this strange behavior:

1) I save the data into the database using the script above.
2) I look in phpMyAdmin and see broken encoding.
3) I load the data back into my website and display it using this:

<?php
        include 'includes/dbconnect.php';
        $data = mysql_query("SELECT * FROM news ORDER BY id DESC limit 20") or die(mysql_error()); 

        while($info = mysql_fetch_array( $data )) 
        {
            echo '<article><h3> '.$info['subject'].'</h3><div id="date">'.$info['date'].'</div>';
            echo '<p>'.$info['news']. '</p></article>';
        } 
 ?>

The encoding is correct and no weird characters are displayed.

4) I load the exact same data into an HTML form (for editing purposes) and see the same broken encoding as in the database.

What happened? I really don't get it. I tried fixing this by re-saving everything in UTF-8, altering the tables, changing their encodings to different utf8 variants, etc.

This is an example of the data I pass to the database (it is in Czech, with HTML tags):

<p>Vařila myšička kašičku</p>
<img src="someImage.jpg">
<p>Další text</p>

Thanks for any help...

Smajl
  • possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – deceze Dec 07 '13 at 16:40

2 Answers

2

The commands for specifying the character set should be:

set names 'utf8';

If you check the result returned from your queries at the moment, what does it say? If I try it in the monitor I get the following:

mysql> set names 'UTF-8';
ERROR 1115 (42000): Unknown character set: 'UTF-8'

Have you tried running set names 'utf8' before the SELECT as well? The characters you describe make me think you're getting back the correct UTF-8 bytes, but they're being interpreted as ISO-8859-1.
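
To make that diagnosis concrete, here is a small sketch that reproduces the effect without a database. It assumes the file is saved as UTF-8 and that the mbstring extension is available; the variable names are only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is enabled.
$original = "Vařila myšička kašičku";

// Misread every UTF-8 byte as one ISO-8859-1 character, then encode the
// result as UTF-8 again. This reproduces the garbling ("š" turns into "Å¡").
$mojibake = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');
echo $mojibake, "\n";

// The damage is reversible as long as no bytes were dropped along the way.
$restored = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');
var_dump($restored === $original);
```

If text stored in the table looks like the first echo, the bytes were most likely reinterpreted somewhere between PHP and MySQL, which is what a mismatched connection character set does.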

chooban
  • Yeah, I see the same error now... The command was not there before (it was part of my effort to solve the problem... – Smajl Dec 07 '13 at 16:06
  • OK, I just corrected this command. Now the data in my db is saved correctly, but the data loaded back is wrongly encoded... what now? – Smajl Dec 07 '13 at 16:09
  • Thanks man, I will try to put this command before every query and will let you know how it turned out... but this seems like the root of my problem :-) – Smajl Dec 07 '13 at 16:15
  • Great! I'm sure there will be a global option somewhere, but PHP isn't my forte. – chooban Dec 07 '13 at 16:19
  • Great, it works (haven't tried with the HTML form yet, but I guess it will be okay...) :-) small mistakes are the worst ones xD – Smajl Dec 07 '13 at 16:22
  • When you start seeing those capital A's with accents, it's often UTF-8 bytes being interpreted as Latin-1. In my experience, at any rate. – chooban Dec 07 '13 at 16:25
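
For the edit form mentioned in the question, one possible cause (an assumption, since the form code is not shown) is an escaping call that declares the wrong charset. Before PHP 5.4, htmlentities() and htmlspecialchars() defaulted to ISO-8859-1. The sketch below contrasts the two cases; the textarea name is taken from the question's $_POST key.

```php
<?php
// Assumption: this file is saved as UTF-8.
$news = "<p>Další text</p>";

// Explicit UTF-8: only the HTML special characters are escaped.
$safe = htmlspecialchars($news, ENT_QUOTES, 'UTF-8');

// Wrong charset: htmlentities() treats every byte as its own character,
// so each multi-byte letter becomes a run of Latin-1 entities.
$broken = htmlentities($news, ENT_QUOTES, 'ISO-8859-1');

echo '<textarea name="newstext">' . $safe . '</textarea>', "\n";
echo $broken, "\n";
```

Passing the charset explicitly keeps the output independent of the PHP version's default.
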
0

You are not escaping single quotes or other special characters. Use mysql_real_escape_string:

$newsTitel   = isset($_POST['title']) ? mysql_real_escape_string($_POST['title']) : 'Untitled';
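
As a hedged alternative: the mysql_* functions were removed in PHP 7, so the sketch below swaps in mysqli, which is not what the original code uses. It sets the connection character set once with set_charset() and uses a prepared statement instead of manual escaping. The helper names openNewsDb and saveNews and all credentials are hypothetical; the table and column names come from the question.

```php
<?php
// Hypothetical helper: host, user, password and database are placeholders.
function openNewsDb($host, $user, $password, $database)
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
    $db = new mysqli($host, $user, $password, $database);
    // Set the connection character set once, for every later query.
    $db->set_charset('utf8');
    return $db;
}

// Hypothetical helper: inserts one row into the question's news table.
function saveNews(mysqli $db, $title, $content)
{
    $date = date('Y/m/d'); // same format as the question's script
    $stmt = $db->prepare('INSERT INTO news (date, subject, news) VALUES (?, ?, ?)');
    // Bound parameters need no escaping; all three are sent as strings.
    $stmt->bind_param('sss', $date, $title, $content);
    $stmt->execute();
    $stmt->close();
}
```

A call would look like `$db = openNewsDb('localhost', 'user', 'secret', 'mydb');` followed by `saveNews($db, $_POST['title'], $_POST['newstext']);`. With the old extension, mysql_set_charset('utf8') plays the same role as set_charset() here.
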
drj
  • Nice tip, but that's not the problem... when I load it and display it with echo, or rewrite it directly in my database to some "readable" text, it displays incorrectly on my webpage... – Smajl Nov 03 '13 at 22:41