I'm collecting tweets for a large number of users, so the script will run unsupervised for days or weeks.
I have a list of user_ids in big_list.
I think some of the tweets are private, which makes my script stop. I'd like the script to continue on to the next user_id instead (and maybe print a warning message).
I'd also like suggestions on how to make it robust to other errors or exceptions (for example, having the script sleep on an error or timeout).
This is a summary of what I have:
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
my_api = tweepy.API(auth)
for id_str in big_list:
    all_tweets = get_all_tweets(id_str=id_str, api=my_api)
    # Here: insert some tweets into my database
The get_all_tweets function is what throws the errors; it basically calls the following repeatedly:
my_api.user_timeline(user_id=id_str, count=200)
Just in case, the traceback it gives is the following:
/home/username/anaconda/lib/python2.7/site-packages/tweepy/binder.pyc in execute(self)
201 except Exception:
202 error_msg = "Twitter error response: status code = %s" % resp.status
--> 203 raise TweepError(error_msg, resp)
204
205 # Parse the response payload
TweepError: Not authorized.
Let me know if you need more details. Thanks!
----------- EDIT --------
This question has some info.
I guess I can use a try/except block for the different types of errors? I don't know all the relevant ones, so best practices from someone with field experience would be appreciated!
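For reference, a minimal sketch of the try/except idea: wrap the per-user call, log a warning, and move on to the next user_id. The names collect_all, fetch, save and skip_errors are hypothetical and only illustrate the pattern. With tweepy, skip_errors would presumably be (tweepy.TweepError,), assuming a tweepy version that still exposes that class.

```python
import logging

log = logging.getLogger("tweet_collector")


def collect_all(user_ids, fetch, save, skip_errors=(Exception,)):
    """Fetch and save tweets per user; log and skip users whose fetch fails.

    fetch(id_str) and save(id_str, tweets) are hypothetical callables that
    stand in for get_all_tweets and the database insert.
    """
    skipped = []
    for id_str in user_ids:
        try:
            tweets = fetch(id_str)
        except skip_errors as exc:
            # e.g. "Not authorized." for a protected account: warn and move on
            log.warning("Skipping user %s: %s", id_str, exc)
            skipped.append(id_str)
            continue
        save(id_str, tweets)
    # Returning the skipped ids makes it possible to retry or inspect them later.
    return skipped


# Possible usage with the names from the question (untested assumption):
# skipped = collect_all(
#     big_list,
#     fetch=lambda id_str: get_all_tweets(id_str=id_str, api=my_api),
#     save=save_tweets,                      # hypothetical database helper
#     skip_errors=(tweepy.TweepError,))
```

Catching only the listed exception types, rather than everything, keeps genuine bugs (a typo, a database failure) visible instead of silently skipping users.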
---------- EDIT 2 -------
I'm getting some Rate limit exceeded errors, so I'm making the loop sleep like this. The else part would handle the "Not authorized" error and some other (unknown?) errors. This still makes me lose an element of big_list, though.
import time

for id_str in big_list:
    try:
        all_tweets = get_all_tweets(id_str=id_str, api=my_api)
        # HERE: save tweets
    except tweepy.TweepError as e:
        # Compare the error text; the exception object itself never equals a string
        if str(e) == "[{u'message': u'Rate limit exceeded', u'code': 88}]":
            time.sleep(60 * 5)  # Sleep for 5 minutes
        else:
            print(e)
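One way to avoid losing the element after sleeping is to retry the same call rather than falling through to the next iteration. The sketch below is only an illustration under stated assumptions: call_with_retry, is_retryable, max_attempts and wait_seconds are hypothetical names, and the rate-limit check simply looks for the "Rate limit exceeded" text shown above in the error message.

```python
import time


def call_with_retry(func, is_retryable, max_attempts=5, wait_seconds=300,
                    sleep=time.sleep):
    """Call func(); on a retryable error, sleep and try the same call again.

    is_retryable(exc) is a hypothetical predicate deciding whether an error
    is worth waiting for (e.g. a rate limit) or should be raised right away.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on non-retryable errors or once attempts are exhausted.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            sleep(wait_seconds)


# Possible usage with the names from the question (untested assumption):
# for id_str in big_list:
#     try:
#         all_tweets = call_with_retry(
#             lambda: get_all_tweets(id_str=id_str, api=my_api),
#             is_retryable=lambda exc: "Rate limit exceeded" in str(exc))
#     except tweepy.TweepError as e:
#         print(e)  # e.g. "Not authorized.": skip this user
#         continue
#     # HERE: save tweets
```

Because the retry happens inside the helper, the for loop only advances once the call has succeeded or failed for a non-retryable reason. Some tweepy versions also accept wait_on_rate_limit=True in tweepy.API, which may make the manual sleep unnecessary; that is worth checking against the installed version.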