
MySQLdb is a Python module for communicating with a MySQL database. It provides escape_string, a function that escapes special characters in SQL. For example, a statement like 'Update table Set col = "My"s"' causes an error, so escape_string adds a '\' before the " in My"s. However, in a multibyte encoding such as GBK, which stores a Chinese character in two bytes, escape_string scans the input one byte at a time, so some characters are escaped incorrectly. For example, the Chinese character '昞' is encoded as the bytes '\x95\x5c', and the second byte, \x5c, is the same as an ASCII backslash. If the SQL is 'update table set col = "昞"', then MySQLdb.escape_string(sql) returns: update table set col = "昞\", which is wrong and cannot be executed. Has anyone run into this problem?
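To show the failure without a database, here is a minimal sketch. The helper naive_escape is hypothetical: it only mimics an escaper that works byte by byte and is not MySQLdb's actual implementation.

```python
# Hypothetical stand-in for a charset-unaware escaper: it inspects raw bytes
# one at a time and has no notion of multibyte characters.
def naive_escape(raw: bytes) -> bytes:
    out = bytearray()
    for byte in raw:
        if byte in b"\\\"'":   # backslash, double quote, single quote
            out.append(0x5C)   # prefix the byte with a backslash
        out.append(byte)
    return bytes(out)


# Two bytes that form a single GBK character whose trail byte is 0x5c
# (the character given in the question above).
raw = b"\x95\x5c"
char = raw.decode("gbk")

escaped = naive_escape(raw)
print(len(char))                             # length of the decoded text
print(escaped)                               # bytes after naive escaping
print(escaped.decode("gbk") == char + "\\")  # a stray backslash was appended?
```

The trail byte is treated as a backslash and doubled, so the escaped bytes decode to the original character followed by a stray backslash, which then swallows the closing quote in the statement.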

P.S. I googled the problem and found that PHP has a function, mysqli_set_charset, that handles this case. Is there an equivalent in Python?

BenMorel
Red Lv

1 Answer


This problem is most likely caused by the default character set of your connection being latin1 instead of Unicode. There are a couple of different things you can try. From this post:

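import MySQLdb as mysql

# Set the connection character set explicitly instead of relying on the default.
conn = mysql.connect(host='127.0.0.1',
                     user='user',
                     passwd='passwd',
                     db='db',
                     charset='utf8',
                     use_unicode=True)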

Then run your query with a parameter placeholder, so that the driver does the escaping for you:

cursor = conn.cursor()
cursor.execute('INSERT INTO mytable VALUES (null, %s)',
               ('\x95\x5c',))

Apparently a similar problem was solved by running the following query first:

SET NAMES 'gbk'
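
If you still need to escape a string by hand, MySQLdb connections also have a set_character_set method and a connection-level escape_string, which, as far as I know, takes the connection's character set into account, unlike the module-level function. A minimal sketch, assuming conn is an open connection returned by MySQLdb.connect; the helper name escape_gbk is made up for illustration:

```python
def escape_gbk(conn, text):
    """Escape text for a GBK connection (hypothetical helper).

    Assumes conn is an open MySQLdb connection object.
    """
    # Tell the client library which character set the connection uses.
    conn.set_character_set("gbk")
    # Escape through the connection rather than the module-level function.
    return conn.escape_string(text.encode("gbk"))
```

Passing values as parameters to cursor.execute, as shown earlier, is still the simpler option, because no manual escaping is involved.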
John