I've learned how to send tweets with Python, but I'm wondering if it's possible to send emojis or other special Unicode characters in the tweets.
For example, when I try to tweet u'1F430', it simply shows up as "1F430" in the tweet.
I've learned how to send tweets with Python, but I'm wondering if it's possible to send emojis or other special Unicode characters in the tweets.
For example, when I try to tweet u'1F430', it simply shows up as "1F430" in the tweet.
>>> len(u'1f430')
5
>>> len(u'\U0001F430')
1 # the latter might be equal to two in Python 2 on a narrow build (Windows, OS X)
The former is 5 characters, the latter is a single character.
If you want to specify the character in Python source code then you could use its name for readability:
>>> print(u"\N{RABBIT FACE}")
Note: it might not work in Windows console. To display non-BMP Unicode characters there, you could use win-unicode-console + ConEmu.
If you are reading it from a file, network, etc then this character is no different from any other: to decode bytes into Unicode text, you should specify a character encoding e.g.:
import io
with io.open('filename', encoding='utf-8') as file:
text = file.read()
Which specific encoding to use depends on the source e.g., see A good way to get the charset/encoding of an HTTP response in Python
u'1F430' is the literal string "1F430". What character are you trying to get? In general you can get literal bytes into a python string using "\x20", e.g.
>>> print(b"#\x20#")
# #
The byte with hexadecimal value of 20 (decimal 32) in between 2 hashes. Bytes are decoded as ASCII by default, and ASCII char (hex) 20 is a space.
>>> print(u"#\u0020#")
# #
>>> print(u"#\U0001F430#")
# #
Unicode codepoint 20 (a single space) in the middle of 2 hashes
See https://docs.python.org/3.3/howto/unicode.html for more info. NB It can get a little confusing since python will implicitly convert between bytes and unicode (using the ASCII encoding) in a lot of cases, which can hide the issue from you for a while.