With sqlalchemy 0.9.7, I am trying to store emoji into MySQL 5.5 with utf8mb4 enabled. However, for some reason sqlalchemy is killing my emoji chars and I cannot figure out why. I can see the emoji char in the object before attempting to save the data via sqlalchemy. After the save, an error is thrown and emoji is rendered as ????.
The error is the following. Note it is surrounded by debug messages that output type(post.message)
and post.message
.
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:274]:
<type 'unicode'>
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:275]:
--------------------------------------------------------------------------------
/mnt/hgfs/crw/dev/hyper/env/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py:436: Warning: Incorrect string value: '\xF0\x9F\x98\x83' for column 'message' at row 1
cursor.execute(statement, parameters)
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:277]:
<type 'unicode'>
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:278]:
????
--------------------------------------------------------------------------------
My code looks like:
Base = declarative_base()
post = Table('post', Base.metadata,
Column("id", Integer, primary_key=True),
Column("message", UnicodeText),
Column("created_at", DateTime),
Column("updated_at", DateTime),
)
mapper(Post, post)
Is there any other setup I need to do to make this work?
Update: If I do the following:
Base = declarative_base()
post = Table('post', Base.metadata,
Column("id", Integer, primary_key=True),
Column("message", UnicodeText),
Column("created_at", DateTime),
Column("updated_at", DateTime),
mysql_engine='InnoDB',
mysql_charset='utf8mb4'
)
mapper(Post, post)
I get the following:
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:274]:
<type 'unicode'>
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:275]:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:277]:
<type 'unicode'>
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
DEBUG in __init__ [/mnt/hgfs/crw/dev/hyper/hyper/blueprint/chat/__init__.py:278]:
????
--------------------------------------------------------------------------------
No error, but it is still obliterating the emoji. What is going on?!
Update 2:
Changing the database URI being called from:
mysql+mysqldb://user:pass@localhost/datab?charset=utf8
To
mysql+mysqldb://user:pass@localhost/datab
Fixed the problem. sqlalchemy threw an error when I tried to use charset=utf8mb4&use_unicode=0
. But is that wise? According to http://docs.sqlalchemy.org/en/rel_0_9/dialects/mysql.html#module-sqlalchemy.dialects.mysql.mysqldb, it is not! Would love any feedback on this solution.