I have a MySQL table set to CHARACTER SET utf8mb4, with a column x declared as CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci. I can run an SQL command directly against the database that inserts a 4-byte Unicode character, like

INSERT INTO mytable (x) VALUES ('💩');

but when I run the following in web2py, I get a different entry, which looks like ????

sql = u"INSERT INTO mytable (x) VALUES (%s)"
db.executesql(sql, (u'💩',))

Is there something I need to set in web2py to tell it to pass Unicode characters through without alteration?

Addendum: the same ???? entry occurs when I use the DAL too, as in

db.mytable.insert(x=u'💩')
user2667066
  • Update: I get the following in the logs: `/usr/local/lib/python2.7/site-packages/pymysql/cursors.py:165: Warning: (1300, u"Invalid utf8 character string: 'F09F92'") result = self._query(query) /usr/local/lib/python2.7/site-packages/pymysql/cursors.py:165: Warning: (1366, u"Incorrect string value: '\\xF0\\x9F\\x92\\xA9' for column 'search_string' at row 1")`. My PyMySQL is the most recent, 0.8.0. – user2667066 Jan 24 '18 at 23:41
  • Also, it *does* work if I use pyMySQL on its own (i.e. not via Web2Py) – user2667066 Jan 24 '18 at 23:49
  • I have reported this as a bug at https://github.com/web2py/web2py/issues/1838 - I hope that's the right thing to do. – user2667066 Jan 25 '18 at 00:55

1 Answer

See "question mark" in Trouble with UTF-8 characters; what I see is not what I stored

See Python notes in http://mysql.rjweb.org/doc.php/charcoll#python . In particular, what are your connection parameters (between Python and MySQL)?

u"Invalid utf8 character string: 'F09F92'" implies that Python (perhaps from interpreting a MySQL utf8 error) is baffled by the 4-byte pile of poo.
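
To make the byte count concrete, here is a small sketch (not part of the original answer) that encodes U+1F4A9, the character whose bytes appear in the warnings quoted in the question's comments. The variable names are illustrative only.

```python
import binascii

# U+1F4A9 (PILE OF POO) lies outside the Basic Multilingual Plane,
# so UTF-8 needs four bytes to encode it.
poo = u"\U0001F4A9"
encoded = poo.encode("utf-8")

# hexlify behaves the same way on Python 2.7 and Python 3.
hex_bytes = binascii.hexlify(encoded).decode("ascii").upper()

print(len(encoded))  # 4
print(hex_bytes)     # F09F92A9
```

The first three bytes, F09F92, match the string in warning 1300, which is consistent with the connection using MySQL's 3-byte utf8 character set rather than utf8mb4.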

Rick James
  • really helpful pointers, thanks. The odd thing here is that it works using python and pymysql from the command-line, but not from within web2py, which is why I reported it as a bug on the web2py github site. I will check the connection parameters. – user2667066 Jan 25 '18 at 09:35
  • So in the web2py DAL, it doesn't make any difference if I set DAL(db_codec="utf8mb4"...), but it *does* work if I specify the connection parameters in the string that defines the DB, as described in https://stackoverflow.com/questions/39349972/how-do-i-specify-utf8mb4-for-a-mysql-column-in-web2py. That now works, thanks! – user2667066 Jan 25 '18 at 09:50
  • @user2667066 - Oops, the DAL is covered farther down in my second link; I failed to look for "web2py". – Rick James Jan 25 '18 at 13:45
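
For reference, a minimal sketch of the fix described in the comments above, where the charset goes into the DAL connection string itself. It assumes the `?set_encoding=utf8mb4` URI suffix described in the linked Stack Overflow question; the helper `build_mysql_uri` and the credentials, host, and database names are hypothetical placeholders, not values from this thread.

```python
def build_mysql_uri(user, password, host, database, charset="utf8mb4"):
    """Build a web2py DAL MySQL URI that requests the given connection charset.

    Hypothetical helper for illustration only; the set_encoding suffix is
    assumed from the Stack Overflow question linked in the comments.
    """
    return "mysql://%s:%s@%s/%s?set_encoding=%s" % (
        user, password, host, database, charset)


# Placeholder credentials, not real values.
uri = build_mysql_uri("myuser", "mypassword", "localhost", "mydb")
print(uri)

# Inside a web2py model file the URI would then be passed to DAL, e.g.:
# db = DAL(uri)
```

As reported in the comments, setting db_codec alone made no difference, so the sketch puts the charset in the URI rather than in a DAL keyword argument.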