I have been trying to send emojis through POST requests to my server (Python on the server side) to store them in a database. I take the full string and convert it to UTF-8. The problem is that some emojis are sent and stored fine, while others throw an error on the server side: Incorrect string value: '\\xF0\\x9F\\x8E\\xAE'

I think this is because some emojis, like ❤️, are percent-encoded as %E2%9D%A4%EF%B8%8F when sent, while others, like 🎮, are encoded as %F0%9F%8E%AE.

I have tested the requests through Postman. The red heart works, but the emojis with 4 codes don't, and I get that error.

Here is a Postman log capture: [image]

And here is the error from the Python Django API:

OperationalError at /api/addcomment
(1366, "Incorrect string value: '\\xF0\\x9F\\x8E\\xAE' for column 'text' at row 1")
Django Version: 2.2.5
Exception Type: OperationalError
Exception Value:    
(1366, "Incorrect string value: '\\xF0\\x9F\\x8E\\xAE' for column 'text' at row 1")
Exception Location: /var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/site-packages/MySQLdb/connections.py in query, line 226
Python Executable:  /var/www/vhosts/*/httpdocs/pythonvenv/bin/python
Python Version: 3.5.2
Python Path:    
['/var/www/vhosts/*/httpdocs/pythonvenv/bin',
 '/var/www/vhosts/*/httpdocs/app/app',
 '/var/www/vhosts/*/httpdocs/app',
 '/var/www/vhosts/*/httpdocs',
 '/usr/share/passenger/helper-scripts',
 '/var/www/vhosts/*/httpdocs/pythonvenv/lib/python35.zip',
 '/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5',
 '/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/plat-x86_64-linux-gnu',
 '/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/lib-dynload',
 '/usr/lib/python3.5',
 '/usr/lib/python3.5/plat-x86_64-linux-gnu',
 '/var/www/vhosts/*/httpdocs/pythonvenv/lib/python3.5/site-packages']

I have replaced the original URL with *.

For more info: in phpMyAdmin I cannot insert those emojis either (the 4-code ones, like the gamepad), from either the SQL tab or the Insert tab, but I can insert the 6-code ones, like the red heart, from both. I have tried several utf8 and utf8mb4 collations for both the column and the table.

This happens when inserting an emoji, whether or not the database, table, and column are set to utf8mb4: [image]

Any help? Thanks!

Ian Ortega
  • I believe it would be helpful if you could share the original string values and the code for your current approach when converting them. – Lucas Infante Aug 11 '20 at 10:09
  • @LucasInfante I don't think the approach on Android matters in this case, because the error is the same when sending just the emoji alone through Postman, and I have attached a Postman log of sending just emojis. When I send the red heart, one or many, it works. When I send others, the error appears. Understanding what is happening and solving it in Postman will make it work on Android. – Ian Ortega Aug 11 '20 at 10:13
  • It should be possible to reduce this question to either the Android side is sending incorrect values, or the Python side is processing the received values incorrectly. – snakecharmerb Aug 11 '20 at 10:18
  • I don't think the emojis are sent incorrectly, because I have confirmed with a UTF-8 converter that some emojis have 4 codes and others have 6. The 6-code ones work and the others don't, so I think it has to be a database or a Python problem. – Ian Ortega Aug 11 '20 at 10:21
  • Does this answer your question? https://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc "Incorrect string value" is an error given by mysql database, to fix it you have to change the column character set to utf8mb4 – Joni Aug 11 '20 at 11:20
  • @Joni it gives me the same error with utf8mb4 – Ian Ortega Aug 11 '20 at 11:46
  • Can you edit the question to include all the details: which exact component on the server side originates the error (database, api, something else), and the complete error message? – Joni Aug 11 '20 at 11:48
  • @Joni OK, give me a few minutes, I'm at lunch now – Ian Ortega Aug 11 '20 at 12:10
  • @Joni I have added more info. I get the same error when inserting it through the phpMyAdmin SQL or Insert tab, so I think it's a database problem that I can't figure out how to solve. – Ian Ortega Aug 11 '20 at 12:28
  • What are the character set and collation set on the column you call "text"? – Joni Aug 11 '20 at 12:39
  • @Joni That's what I said: I have tried some Spanish and general variants of utf8 and utf8mb4, and the data type of the column is MEDIUMTEXT. – Ian Ortega Aug 11 '20 at 12:41

1 Answer

Both of these need to be set to utf8mb4:

  • The column charset

  • The database connection charset

The first one determines what strings can be stored in the column. The second determines the character set assumed for string literals. (Oddly, if you put a 4-byte UTF-8 sequence in a string literal, MySQL can still treat it as "3-byte utf8" and doesn't give an error until you try to use it.)
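To see why the red heart gets through while the gamepad does not, here is a minimal Python sketch that decodes the two percent-encoded payloads from the question and counts UTF-8 bytes per character. The helper names `utf8_widths` and `needs_utf8mb4` are made up for illustration; only `urllib.parse.unquote` is a real library call.

```python
from urllib.parse import unquote


def utf8_widths(text):
    """Return (code point, UTF-8 byte count) for each character in text."""
    return [("U+%04X" % ord(ch), len(ch.encode("utf-8"))) for ch in text]


def needs_utf8mb4(text):
    """True if any character lies outside the Basic Multilingual Plane,
    i.e. takes 4 bytes in UTF-8 and cannot fit in MySQL's 3-byte utf8."""
    return any(ord(ch) > 0xFFFF for ch in text)


# The two payloads quoted in the question.
heart = unquote("%E2%9D%A4%EF%B8%8F")  # red heart plus a variation selector
gamepad = unquote("%F0%9F%8E%AE")      # a single 4-byte character

for label, text in (("heart", heart), ("gamepad", gamepad)):
    print(label, utf8_widths(text), "needs utf8mb4:", needs_utf8mb4(text))
```

The "6 codes" of the heart are two separate characters of 3 bytes each, so each one fits in 3-byte utf8. The gamepad is one character that needs 4 bytes, which is the case only utf8mb4 can hold.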

To find if the database connection charset is the problem, you can try setting the character set on the string literal explicitly. If this works, the column encoding is fine, but the connection isn't:

insert into demo_table set `text` = _utf8mb4'🎮';

You seem to be using Django. I don't know much about Django but it looks like the connection encoding is set somewhere in the database connection options. Going by https://chriskief.com/2017/06/18/django-and-mysql-emoticons/ :

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        # ... (other connection settings unchanged)
        'OPTIONS': {'charset': 'utf8mb4'},  # charset used for the database connection
    }
}
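One way to check whether the connection charset actually took effect is to ask MySQL for its `character_set_%` variables over that same connection. This is only a sketch: `non_utf8mb4_settings` is a hypothetical helper, and it assumes `cursor` is a DB-API cursor on a MySQL connection, for example the one returned by `django.db.connection.cursor()`.

```python
# The three session variables that `SET NAMES` changes.
CLIENT_VARS = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def non_utf8mb4_settings(cursor):
    """Return the connection charset variables that are not utf8mb4.

    An empty dict means the connection side looks fine, so any remaining
    problem is more likely in the column or table definition.
    """
    cursor.execute("SHOW VARIABLES LIKE 'character_set_%'")
    values = dict(cursor.fetchall())  # rows are (Variable_name, Value)
    return {
        name: values.get(name)
        for name in CLIENT_VARS
        if values.get(name) != "utf8mb4"
    }


# Possible usage inside a Django shell (assumes a configured MySQL backend):
# from django.db import connection
# with connection.cursor() as cursor:
#     print(non_utf8mb4_settings(cursor))
```

If any of the three come back as `utf8` or `latin1`, the connection is probably the culprit rather than the column.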
Joni
  • OK, I have tried it, and here is what happens in phpMyAdmin when copying emojis with the database, table, and column on utf8mb4: inserting the red heart still works, and inserting the gamepad, for example, still doesn't. If I go to the SQL tab and insert it with `_utf8mb4''`, the query succeeds but it stores a question mark `?`, so I think it is still not working. I don't think it is a problem with the Django connection (which might be fixable for API usage), because it doesn't work in phpMyAdmin either; it is a database problem. Thanks for your time :) – Ian Ortega Aug 11 '20 at 13:31
  • set names utf8mb4; and query the database again. – Veleirian Jan 18 '22 at 05:31