Query Twitter Status by Using Python and Tweepy

Question

I am trying to query a specified user's tweets that include a specified keyword in the tweet text. Here is my code:

# Import Tweepy, sleep, credentials.py
import tweepy
from time import sleep
from credentials import *

# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"

def twtr2():
    raw_tweets = tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(50)
    for tweet in raw_tweets:
        if tweet['user']['screen_name'] == SCREEN_NAME:
            print tweet
twtr2()

I get the following error message:

Traceback (most recent call last):
  File "test2.py", line 19, in <module>
    twtr2()
  File "test2.py", line 17, in twtr2
    if tweet['user']['screen_name'] == SCREEN_NAME:
TypeError: 'Status' object has no attribute '__getitem__'

After a lot of searching, I thought that maybe I needed to load Twitter's JSON into Python first, so I tried the following:

import tweepy, json
from time import sleep
from credentials import *

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"

raw_tweets = tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(50)
for tweet in raw_tweets:
    load_tweet = json.loads(tweet)
    if load_tweet['user']['screen_name'] == SCREEN_NAME:
        print tweet

However, this fails as well:

Traceback (most recent call last):
  File "test2.py", line 35, in <module>
    load_tweet = json.loads(tweet)
  File "C:\Python27\lib\json\__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

Does anyone know what's wrong with my code? And can you help me to fix it?

Thanks in advance!

score 1 · Accepted Answer · answered Mar 08 '17 at 04:36

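I figured it out: tweepy returns Status objects, whose fields are read as attributes (tweet.user.screen_name) rather than with dictionary keys. Here is the solution: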

# Import Tweepy, sleep, credentials.py
import tweepy
from time import sleep
from credentials import *

# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"
for tweet in tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(200):
    if tweet.user.screen_name == SCREEN_NAME:
        print tweet.text
        print tweet.user.screen_name

Please note that this is not an efficient way to find tweets that satisfy both conditions (screen_name and keyword), because it searches by keyword first and only then filters the results by screen_name. If the keyword is very popular, like "TheBachelor" here, and the number of tweets is limited (200), it is quite possible that none of the 200 tweets were sent by the specified screen_name. Querying by screen_name first and then filtering by keyword would probably give better results, but that is beyond the scope of this answer.
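For readers who want to try the author-first idea anyway, here is a minimal Python 3 sketch of two possible approaches. The helper names build_author_query and filter_by_keyword are hypothetical, and the sample objects are stand-ins for tweepy Status objects so that the snippet runs without credentials. The first helper relies on Twitter's "from:" search operator; the second assumes only that each status has a .text attribute.

```python
from types import SimpleNamespace


def build_author_query(screen_name, keyword):
    """Combine the `from:` search operator with a keyword in one query string."""
    return "from:{} {}".format(screen_name, keyword)


def filter_by_keyword(statuses, keyword):
    """Yield statuses whose text contains the keyword, ignoring case.

    `statuses` can be any iterable of objects with a `.text` attribute,
    for example the items of a tweepy Cursor over one user's timeline.
    """
    needle = keyword.lower()
    for status in statuses:
        if needle in status.text.lower():
            yield status


# Stand-in objects (hypothetical data) used only to exercise the filter.
sample = [
    SimpleNamespace(text="Tonight on #TheBachelor"),
    SimpleNamespace(text="Something unrelated"),
    SimpleNamespace(text="thebachelor recap is up"),
]

query = build_author_query("BachelorABC", "TheBachelor")
matches = [status.text for status in filter_by_keyword(sample, "TheBachelor")]
```

With the api object from the code above, the query string could be passed as q=build_author_query(SCREEN_NAME, KEYWORD) to tweepy.Cursor(api.search, ...), or the filter could be applied to tweepy.Cursor(api.user_timeline, screen_name=SCREEN_NAME).items(200). Neither call has been run against the live API here, and newer tweepy releases name the search method search_tweets.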

I will leave it there.

score 0 · Answer 2 · edited May 23 '17 at 12:31

The issue is with this line:

load_tweet = json.loads(tweet)

The "tweet" object is a tweepy Status object, not a JSON string, so json.loads cannot parse it. If you want to work with the raw JSON, follow the Stack Overflow post on how to use JSON objects with tweepy.
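The difference can be illustrated with a minimal Python 3 sketch. make_fake_status is a hypothetical helper that builds a stand-in for the Status objects tweepy yields, so the snippet runs without credentials; it assumes, as in tweepy's model classes, that the already-parsed payload is kept as a dict on the _json attribute.

```python
import json
from types import SimpleNamespace


def make_fake_status(screen_name, text):
    """Build a hypothetical stand-in for a tweepy Status (illustration only)."""
    payload = {"text": text, "user": {"screen_name": screen_name}}
    return SimpleNamespace(
        text=text,
        user=SimpleNamespace(screen_name=screen_name),
        _json=payload,  # assumption: mirrors the dict tweepy keeps on Status._json
    )


tweet = make_fake_status("BachelorABC", "Watching #TheBachelor tonight")

# Attribute access works on the object itself.
name_from_attribute = tweet.user.screen_name

# Dict-style access works on the already-parsed payload, not on the object.
name_from_dict = tweet._json["user"]["screen_name"]

# json.loads() only accepts text, so passing the object raises TypeError.
try:
    json.loads(tweet)
    loads_error = None
except TypeError as exc:
    loads_error = exc

# A JSON string can still be produced from the payload when one is needed.
as_json_text = json.dumps(tweet._json)
```

In other words, there is nothing left to decode by the time the loop sees a tweet: either read attributes directly, or use the parsed dict if dictionary-style access is preferred.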

To print each tweet in a feed, I would follow the example in tweepy's getting started docs:

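# Import Tweepy and the credentials defined in credentials.py
import tweepy
from credentials import *

# Authorize with the Twitter credentials
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

# Print the text of each tweet on the authenticated user's home timeline
public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)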

answered Mar 08 '17 at 03:35 · ethanchewy

Thanks for your time @ethanchewy. I am looking for tweets with a **specified** screen_name and keyword. Your answer is about the first 20 tweets in general. Do you have any idea how to locate the tweets that meet these conditions? – Counter10000 Mar 08 '17 at 03:52
@LinguisticsStudent Take a look at the last code snippet located over here: https://github.com/tweepy/tweepy/blob/master/docs/code_snippet.rst . You would store the screen_names in a list and then search within that list for a certain screen_name. Note that Twitter has strict limitations for querying. – ethanchewy Mar 08 '17 at 03:54
Thanks @ethanchewy. The page you linked above shows how to retrieve the screen_name from a follower or user, not from a status. I will update if I find an answer later. – Counter10000 Mar 08 '17 at 04:01
FYI, @ethanchewy I posted my answer above. – Counter10000 Mar 08 '17 at 04:37
