I have been working on a program to extract tweets from a Twitter account. It looks like this:
import tweepy
from tweepy import OAuthHandler
import json
import time
import sys
import builtins
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
user = api.get_user('nytimes')
statuses = api.user_timeline(id = user.id, count = 200)
for status in statuses:
    print("***")
    print("Tweet id: " + status.id_str)
    print(status.text)
    print("Retweet count: " + str(status.retweet_count))
    print("Favorite count: " + str(status.favorite_count))
    print(status.created_at)
    print("Status place: " + str(status.place))
    print("Source: " + status.source)
    print("Coordinates: " + str(status.coordinates))
    time.sleep(1)
It works fine... until I get a tweet with an emoji. Then I get this error message:
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 19-19: Non-BMP character not supported in Tk
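For reference, the characters the error complains about are the ones whose code point is above U+FFFF (outside the Basic Multilingual Plane), which includes most emoji. A minimal sketch for locating them in a string; `find_non_bmp` and the sample text are made up for illustration and are not part of my program:

```python
def find_non_bmp(text):
    """Return (index, character) pairs for code points above U+FFFF."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


# Made-up tweet text ending in an emoji (U+1F600).
sample = "Breaking news today \U0001F600"

for index, ch in find_non_bmp(sample):
    # ascii() keeps the output printable even where the emoji itself is not.
    print(index, ascii(ch))
```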
Doing some research, I found a bit of code that is supposed to work around this problem:
def print_ucs2(*args, print=builtins.print, **kwds):
    args2 = []
    for a in args:
        a = str(a)
        if max(a) > '\uffff':
            b = a.encode('utf-16le', 'surrogatepass')
            chars = [b[i:i+2].decode('utf-16le', 'surrogatepass')
                     for i in range(0, len(b), 2)]
            a = ''.join(chars)
        args2.append(a)
    print(*args2, **kwds)
builtins._print = builtins.print
builtins.print = print_ucs2
The problem is, once I add this bit of code to my program, it ONLY prints emojis. Nothing else. I don't have the error message anymore... but I don't have the tweets either.
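One thing that might produce exactly this symptom, and this is only a guess: if `args2.append(a)` ended up indented under the `if` when I pasted the snippet, then only strings containing an emoji would ever reach `print`. A rough sketch of the difference, with the conversion pulled into a helper; `to_ucs2`, `convert_args` and `convert_args_dropping` are hypothetical names used here for illustration:

```python
def to_ucs2(text):
    """Split every non-BMP character into a UTF-16 surrogate pair."""
    text = str(text)
    if text and max(text) > '\uffff':
        data = text.encode('utf-16le', 'surrogatepass')
        units = [data[i:i + 2].decode('utf-16le', 'surrogatepass')
                 for i in range(0, len(data), 2)]
        text = ''.join(units)
    return text


def convert_args(args):
    """Append at loop level: every argument is kept, converted or not."""
    converted = []
    for a in args:
        converted.append(to_ucs2(a))
    return converted


def convert_args_dropping(args):
    """Assumed mis-indented variant: append sits inside the if."""
    kept = []
    for a in args:
        a = str(a)
        if a and max(a) > '\uffff':
            kept.append(to_ucs2(a))  # arguments without emoji never get here
    return kept
```

With the second variant, a call like `print("Tweet id: " + status.id_str)` would print an empty line, which would look like the tweets disappearing.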
I've also read that something could be done with .encode('utf-8'), but I'm not sure where to put it; so far, using it has only given me error messages. Any ideas?
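To show what I mean, here is a small sketch of the two approaches as I understand them. As far as I can tell, calling `.encode('utf-8')` on a string returns a `bytes` object, so printing it shows a `b'...'` literal with escape sequences instead of the tweet text. The alternative I have seen suggested is to swap every non-BMP character for a placeholder before printing. `replace_non_bmp` and `safe_print` are hypothetical helpers, not something from tweepy:

```python
def replace_non_bmp(text, replacement='\ufffd'):
    """Replace every character above U+FFFF with a placeholder character."""
    return ''.join(ch if ord(ch) <= 0xFFFF else replacement
                   for ch in str(text))


def safe_print(*args, **kwds):
    """Print the arguments after replacing characters Tk cannot display."""
    print(*(replace_non_bmp(a) for a in args), **kwds)


# Made-up tweet text containing an emoji (U+1F600).
tweet_text = "Good morning \U0001F600"

# encode() yields bytes, not text, so this is not a drop-in fix for print().
encoded = tweet_text.encode('utf-8')

# The cleaned string contains only BMP characters.
cleaned = replace_non_bmp(tweet_text)
```

In my loop, that would mean calling `safe_print(status.text)` in place of `print(status.text)`, at the cost of losing the emoji in the output.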
Thanks,