3

In PHP documentation for mysqli_real_escape_string(), it is written that

Caution Security: the default character set

The character set must be set either at the server level, or with the API function mysqli_set_charset() for it to affect mysqli_real_escape_string().

Source mysqli_real_escape_string

In a further link about character set, it is mentioned that

The character set should be understood and defined, as it has an affect on every action, and includes security implications.

Suource Character sets

Why does it necessary to set the character set for the sake of security and what security implications does it include? Can anyone, explain the concept behind those lines ?

Thanks in advance

Aman Singh
  • 743
  • 1
  • 10
  • 20
  • 3
    what needs to be escaped is Character set dependent –  Jun 02 '15 at 23:50
  • @Dagon - What does it even mean ? – Aman Singh Jun 02 '15 at 23:52
  • 1
    not sure how to be clearer - if you look at the php source the Character's that are escaped, is dependent on the Character set used. What is escaped in *latin1* is not the same as what is escaped in *utf8* so the function needs to know what is being used –  Jun 02 '15 at 23:56
  • ok but what does it matter in terms of security ? – Aman Singh Jun 03 '15 at 00:00
  • unescaped strings can lead to my(sql) injection attacks –  Jun 03 '15 at 00:01
  • Ok, You are trying to say, that if one do not specify the character set, the argument of the function would not be escaped which could leads to mysql injection, right ? – Aman Singh Jun 03 '15 at 00:08

2 Answers2

4

How SQL queries are parsed is dependent on the connection character set. If you did this query:

$value = chr(0xE0) . chr(0x5C);
mysql_query("SELECT '$value'");

then if the connection character set was Latin-1 MySQL would see the invalid:

SELECT 'à\'

whereas if the character set were Shift-JIS, the byte sequence 0xE0,0x5C would be interpreted as a double-byte character:

SELECT '濬'

Add string literal escaping for security:

$value = mysql_real_escape_string($value);
mysql_query("SELECT '$value'");

Now if you've correctly set the connection character set to Shift-JIS with mysql_set_charset, MySQL still sees:

SELECT '濬'

But if you haven't set the connection character set, and MySQL's default character set is Shift-JIS but PHP's default character set is ASCII, PHP doesn't know that the trailing 0x5C character is part of a double-byte sequence, and escapes it, thinking it is generating the valid output:

SELECT 'à\\'

whilst MySQL reads it using Shift-JIS as:

SELECT '濬\'

With the trailing ' escaped with a backslash, this has left the string literal open. The next ' character in the query will end the string, leaving whatever follows in raw SQL content. If you can inject there, the query is vulnerable.

This problem only applies to a few East Asian encodings like Shift-JIS where multibyte sequences can contain bytes which on their own are valid ASCII characters like the backslash. If the mismatched encodings both treat low bytes as always-ASCII (strict ASCII supersets like the more-common mismatch of Latin-1 vs UTF-8), no such confusion is possible.

Luckily servers which default to these encodings are uncommon, so in practice this is a rarely-exploitable issue. But if you have to use mysql_real_escape_string you should do it right. (Better to avoid it completely by using parameterised queries though.)

bobince
  • 528,062
  • 107
  • 651
  • 834
1

If you want to secure your applications against SQL injection, you should be using prepared statements and not escaping your input. (Don't let MySQLi or PDO emulate prepares, either; use real prepared statements if you can!)

Only for situations where you can't use prepared statements should escaping ever be considered (dynamically generated queries, LIMIT). In these specific instances, make sure you're not making mysqli_real_escape_string() bypassable because of misconfigured character sets. (This linked StackOverflow answer by ircmaxell explains the problem better than I ever could.)

WordPress recently had a problem where multibyte characters could bypass their SQL escaping strategy and the security team patched it under the guise of Emoji support.

If you're using mysql_real_escape_string() or mysqli_real_escape_string() you are playing with fire. Be careful that you don't get burned.

Community
  • 1
  • 1
Scott Arciszewski
  • 33,610
  • 16
  • 89
  • 206