1

I'm trying to insert web addresses that contain Scandinavian letters into my database, for example:

ÄÖäöÅå

I'm using:

  • openSUSE 13.2 64-bit Linux with MariaDB.
  • MySQL Server version: 5.5.44-MariaDB openSUSE package
  • PHP Version is 5.4.20

When I try to insert, I get this error message:

Incorrect string value: '\xC4HK\xD6.

This query confirms that the connection character set and collation are set correctly:

if (mysql_query("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci")) {
    echo "Character set OK !";
}

My MySQL query works for everything except URLs that contain Scandinavian letters:

if (mysql_query("INSERT INTO `table` (`address`) VALUES ('$URL')")) {
    $insertCount++;
    echo "<br> insertcount = ".$insertCount."<br>";
} else {
    echo "MySQLerror = ".mysql_error()."<br>"; // Show the MySQL error
}

This is MySQL info from MariaDB, showing that everything is set to utf8mb4:

MariaDB [(none)]> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
10 rows in set (0,00 sec)

How can I correctly insert Scandinavian letters?


Edit

@Monty: These are my database settings:

MariaDB [(none)]> show variables like '%colla%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0,00 sec)

MariaDB [(none)]> show variables like '%charac%';
+--------------------------+------------------------------+
| Variable_name            | Value                        |
+--------------------------+------------------------------+
| character_set_client     | utf8mb4                      |
| character_set_connection | utf8mb4                      |
| character_set_database   | utf8mb4                      |
| character_set_filesystem | binary                       |
| character_set_results    | utf8mb4                      |
| character_set_server     | utf8mb4                      |
| character_set_system     | utf8                         |
| character_sets_dir       | /usr/share/mariadb/charsets/ |
+--------------------------+------------------------------+
8 rows in set (0,00 sec)

MariaDB [(none)]> 

Edit

@Rick James: This is what I got back:

MariaDB [db]> SHOW CREATE TABLE table;
+-------+--------------------------------------------------------------+
| Table | Create Table                                                 |
+-------+--------------------------------------------------------------+
| table | CREATE TABLE table (
  addr varchar(150) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (addr),
  UNIQUE KEY addr (addr)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='List' |
+-------+--------------------------------------------------------------+
1 row in set (0,00 sec)

MariaDB [db]>

Sparky
  • Have a look through http://stackoverflow.com/questions/279170/utf-8-all-the-way-through – Funk Forty Niner Sep 17 '15 at 16:35
  • If you can, you should [stop using `mysql_*` functions](http://stackoverflow.com/questions/12859942/why-shouldnt-i-use-mysql-functions-in-php). They are [officially deprecated](https://wiki.php.net/rfc/mysql_deprecation). [These extensions](http://php.net/manual/en/migration70.removed-exts-sapis.php) have been removed in PHP 7. Learn about [prepared](http://en.wikipedia.org/wiki/Prepared_statement) [statements](http://php.net/manual/en/pdo.prepared-statements.php) instead, and consider using PDO, [it's really not hard](http://jayblanchard.net/demystifying_php_pdo.html). – Jay Blanchard Sep 17 '15 at 16:46
  • [Your script is at risk for SQL Injection Attacks.](http://stackoverflow.com/questions/60174/how-can-i-prevent-sql-injection-in-php) – Jay Blanchard Sep 17 '15 at 16:46
  • @Jay Blanchard I have not got far enough yet to prevent SQL injection, but I will do that as soon as I get this solved. Thanks, mate. – Sparky Sep 17 '15 at 17:02
  • Have you verified that your PHP file is saved as utf-8? – JoSSte Sep 18 '15 at 07:57
  • Mate, sorry for the late answer, but it was. I'm on a whole new level with my script now, and I have all of you here to thank for it, mates :))) – Sparky Dec 26 '15 at 21:16
  • Mates, I just hope that someday I can help someone even close to as much as all of you have helped me with my PHP project – Sparky Dec 26 '15 at 21:30

2 Answers

0

Try this

Verify that the tables where the data is stored have the utf8 character set:

SELECT
  `tables`.`TABLE_NAME`,
  `collations`.`character_set_name`
FROM
  `information_schema`.`TABLES` AS `tables`,
  `information_schema`.`COLLATION_CHARACTER_SET_APPLICABILITY` AS `collations`
WHERE
  `tables`.`table_schema` = DATABASE()
  AND `collations`.`collation_name` = `tables`.`table_collation`
;

Check your database settings:

show variables like '%colla%';
show variables like '%charac%';

Convert the table to the utf8 character set with the utf8_unicode_ci collation:

ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Monty
  • I most humbly thank you for your answer, mate, but I used utf8 before I converted everything to utf8mb4, and it did not work either – Sparky Sep 17 '15 at 17:07
  • I tried to save 'ÄÖäöÅå' in MySQL version 5.0.11-dev, and it inserted fine. If you are using PHP, you can call htmlspecialchars() just before inserting into the database and html_entity_decode() before rendering, or you can use mysql_real_escape_string() – Monty Sep 17 '15 at 17:59
  • OK, thanks a lot, mate. I'll try it and let you know how it goes – Sparky Sep 17 '15 at 18:40
  • I tried $URL = htmlspecialchars($URL); just before inserting, and it made $URL empty whenever it contained Scandinavian letters. – Sparky Sep 18 '15 at 07:04
0

C4 and D6 are latin1 hex for Ä and Ö.
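To illustrate the point about the bytes, here is a minimal PHP sketch. It assumes the mbstring extension is available and that any input which is not valid UTF-8 is latin1 (ISO-8859-1), which may not hold for every source. The helper name to_utf8 is made up for this example, and the sample bytes are the ones shown in the error message.

```php
<?php
// Bytes from the error message: 0xC4 and 0xD6 are "Ä" and "Ö" in latin1.
$latin1 = "\xC4HK\xD6";

// 0xC4 followed by "H" is not a valid UTF-8 sequence, so this check fails.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Hypothetical helper: leave valid UTF-8 alone, otherwise treat the
// input as latin1 (an assumption) and convert it to UTF-8.
function to_utf8($value)
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$utf8 = to_utf8($latin1);

// In UTF-8, "Ä" is C3 84 and "Ö" is C3 96.
echo bin2hex($utf8), "\n"; // c384484bc396
```

If the connection is declared as utf8mb4 but PHP sends latin1 bytes like these, the server sees an invalid utf8mb4 string, which would be consistent with the "Incorrect string value" error.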

Please do SHOW CREATE TABLE to see what CHARACTER SET is set for the column in question. I suspect it is incorrectly latin1.

And, yes, you must switch away from the mysql_* interface.
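As a rough idea of what the switch could look like, here is a sketch using mysqli with a prepared statement. The function names and connection parameters are placeholders invented for this example, and the column name addr is taken from the SHOW CREATE TABLE output in the question (the INSERT in the question uses address instead). It is not tested against the asker's setup.

```php
<?php
// Sketch only: host, user, password, and database are placeholders.
function open_utf8mb4_connection($host, $user, $password, $database)
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_error) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    // Declare the connection character set through the API, so the
    // client library and the server agree on utf8mb4.
    if (!$db->set_charset('utf8mb4')) {
        throw new RuntimeException('set_charset failed: ' . $db->error);
    }
    return $db;
}

// Insert one address with a prepared statement, so the value is never
// interpolated into the SQL text.
function insert_address(mysqli $db, $url)
{
    $stmt = $db->prepare('INSERT INTO `table` (`addr`) VALUES (?)');
    if ($stmt === false) {
        throw new RuntimeException('Prepare failed: ' . $db->error);
    }
    $stmt->bind_param('s', $url);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

Usage would be along the lines of $db = open_utf8mb4_connection('localhost', 'user', 'secret', 'db'); followed by insert_address($db, $URL);. Note that set_charset() only declares the encoding of the connection. The value passed in still has to be valid UTF-8 itself.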

Rick James
  • It looks like no character set is defined for that column, only a collation. Please see my edit. Could that be causing this behavior? – Sparky Sep 18 '15 at 07:40
  • The `CHARACTER SET` is inherited from the table, namely `utf8mb4`. Now to scratch my head -- you have it correct, yet there is still a problem. – Rick James Sep 18 '15 at 15:44
  • mysql_* does not support charsets; mysqli_* does. [Reference](http://dev.mysql.com/doc/apis-php/en/apis-php-mysqli.overview.html). – Rick James Sep 19 '15 at 02:40