I have a string that looks like this: 🔴Use O Mozilla Que Não Trava! Testei! $vip ou $apoio

When I try to save it to my database with ...SET description = %s... and cursor.execute(sql, description), it gives me this error:

Warning: (1366, "Incorrect string value: '\xF0\x9F\x94\xB4Us...' for column 'description' ...

Assuming this is an ASCII symbol, I tried description.decode('ascii'), but this leads to:

'str' object has no attribute 'decode'

How can I determine what encoding it is, and how can I store a string like that in the database? The database is UTF-8 encoded, if that matters.

I am using Python 3 and PyMySQL.

Any hints appreciated!

PrimuS

1 Answer

First, make sure the table column has the correct character set. If it is "latin1", you will not be able to store characters outside Latin-1.

You can use the following query to determine the column's character set:

SELECT CHARACTER_SET_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA='your_database_name' AND TABLE_NAME='your_table_name' AND COLUMN_NAME='description'
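The same check could be run from Python. This is only a sketch: the helper name `column_charset` is hypothetical, and it assumes a DB-API cursor such as the one returned by a PyMySQL connection's `cursor()` method.

```python
COLUMN_CHARSET_SQL = (
    "SELECT CHARACTER_SET_NAME FROM INFORMATION_SCHEMA.COLUMNS "
    "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s AND COLUMN_NAME = %s"
)


def column_charset(cursor, schema, table, column):
    """Return the character set name of one column, or None if no row matches.

    `cursor` is assumed to be a DB-API cursor (hypothetical helper, not part
    of PyMySQL itself).
    """
    cursor.execute(COLUMN_CHARSET_SQL, (schema, table, column))
    row = cursor.fetchone()
    if row is None:
        return None
    # A DictCursor returns a dict; the default cursor returns a tuple.
    if isinstance(row, dict):
        return next(iter(row.values()))
    return row[0]
```

With a live connection this would be called as `column_charset(connection.cursor(), 'your_database_name', 'your_table_name', 'description')`.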

Follow the MySQL documentation if you want to change the column's character set.

You also need to make sure the character set is properly configured for the MySQL connection. Quoting the MySQL documentation:

Character set issues affect not only data storage, but also communication between client programs and the MySQL server. If you want the client program to communicate with the server using a character set different from the default, you'll need to indicate which one. For example, to use the utf8 Unicode character set, issue this statement after connecting to the server:

SET NAMES 'utf8';

Once the character set settings are correct, you will be able to execute your SQL statement. There is no need to encode or decode on the Python side; those methods serve a different purpose.
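To illustrate why no decoding is needed: in Python 3 a `str` is already Unicode text, and the bytes in the warning (`\xF0\x9F\x94\xB4`) are simply the UTF-8 encoding of U+1F534, a character that takes four bytes. A minimal sketch follows; the helper name `needs_four_bytes` is made up for this example.

```python
# U+1F534 is the character whose UTF-8 bytes appear in the warning.
text = "\U0001F534Use O Mozilla"

# Encoding the str shows the same four bytes MySQL complained about.
encoded = text.encode("utf-8")
print(encoded[:4])  # b'\xf0\x9f\x94\xb4'


def needs_four_bytes(s):
    """True if any character lies outside the Basic Multilingual Plane,
    i.e. needs four bytes in UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in s)


print(needs_four_bytes(text))              # True
print(needs_four_bytes("Que Não Trava!"))  # False
```

A check like this can help spot strings that a 3-byte character set will reject before they reach the database.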

jiulongw
  • The first query gives `utf8`. I initialize PyMySQL with `connect = pymysql.connect(host=constants.HOST, user=constants.USERNAME, password=constants.PASSWORD, db=constants.DATABASE, charset='utf8', port=constants.PORT, cursorclass=pymysql.cursors.DictCursor)`, so there should not be an issue, actually? – PrimuS Feb 02 '17 at 23:01
  • @PrimuS Just noticed there's an emoji. You need utf8mb4 for column character set. utf8 is not enough. See http://stackoverflow.com/questions/35125933/mysql-utf8mb4-errors-when-saving-emojis – jiulongw Feb 03 '17 at 18:37
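
Following up on the utf8mb4 suggestion, one way to build the conversion statement is sketched below. The helper names are hypothetical, and the `utf8mb4_unicode_ci` collation is an assumption; pick whichever collation suits your data. Note that `CONVERT TO CHARACTER SET` changes every character column in the table, so back up first.

```python
def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def convert_table_sql(table):
    """Build an ALTER TABLE statement that converts a table to utf8mb4.

    Identifiers cannot be passed as query parameters, so the table name
    is quoted by hand.
    """
    return (
        "ALTER TABLE " + quote_identifier(table)
        + " CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"
    )


print(convert_table_sql("your_table_name"))
```

The connection should use the same character set, for example by passing `charset='utf8mb4'` to `pymysql.connect(...)` instead of `charset='utf8'`.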