I have written code to extract tweets from a list of users [handles]. I am writing the information to a .txt file called "results".

with open("results", "w") as fp:
    for handle in handles:
        print("Analyzing tweets from " + handle + "...")

        user = api.get_user(id=handle)

        fp.write("Handle: " + handle + "\n")
        fp.write("Name: " + user.name + "\n")
        fp.write("Description: " + str(user.description.encode(sys.stdout.encoding, errors='replace')) + "\n")
        fp.write("Followers: " + str(user.followers_count) + "\n")
        fp.write("Following: " + str(user.friends_count) + "\n")

        tweet_counter = 0
        prosocial_tweets_count = 0
        regular_tweets_count = 0

        all_tweets = []
        social_tweets_len = []
        regular_tweets_len = []
        social_tweets_valence = []
        regular_tweets_valence = []

        regular_attachments = 0
        social_attachments = 0

        for tweet in tweepy.Cursor(api.user_timeline, id=user.id).items():
            #control for timeline
            dt = tweet.created_at
            if dt > date_until:
                continue
            if dt < date_from:
                break # XXX: I hope it's OK to break here
            if include_retweets == "no" and tweet.text.startswith("RT"):
                continue
            if include_replies == "no" and tweet.in_reply_to_user_id:
                continue
            tweet_counter += 1

            for word in vocabulary:
                if word in tweet.text.lower():
                    #increase count of pro social tweets
                    prosocial_tweets_count += 1
                    #clean the tweet for valence analysis
                    clean = TextBlob(tweet.text.lower())
                    #calculate valence
                    valence = clean.sentiment.polarity
                    #append the valence to a list
                    social_tweets_valence.append(valence)
                    #append the length of the tweet to a list
                    social_tweets_len.append(len(tweet.text))

                    #check if there is an attachment
                    counting = tweet.text.lower()
                    counting_attachments = counting.count(" https://t.co/")
                    social_attachments = social_attachments + counting_attachments

                    #write date
                    fp.write("  * " + str(dt) + "\n")
                    #write the tweet
                    fp.write("    " + str(tweet.text.encode(sys.stdout.encoding, errors='replace')) + "\n")
                    #write the length of the tweet
                    fp.write("    Length of tweet " + str(len(tweet.text)) + "\n")
                    #write the valence of the tweet
                    fp.write("    Tweet valence " + str(valence) + "\n")
                    #write the retweets of the tweet
                    fp.write("    Retweets count: " + str(tweet.retweet_count) + "\n")
                    #write the likes of the tweet
                    fp.write("    Likes count: " + str(tweet.favorite_count) + "\n")
                    # Report each tweet only once even if it contains more than one prosocial word
                    break
            else:
                #for/else: this code runs only if no vocabulary word matched (the tweet is not prosocial)
                regular_tweets_count += 1
                clean = TextBlob(tweet.text.lower())
                valence = clean.sentiment.polarity

                counting = tweet.text.lower()
                counting_attachments = counting.count(" https://t.co/")
                regular_attachments = regular_attachments + counting_attachments

                regular_tweets_valence.append(valence)
                regular_tweets_len.append(len(tweet.text))

        attachments = regular_attachments + social_attachments

I was wondering whether anyone knows of a nice way to check whether a tweet contains images or videos. I would also like to create a list of the average use of images and videos per user.

2 Answers

Data fetched from the Twitter API comes in JSON format. It contains all the data about that ID as fields and values. So if you just want to check whether an image exists or not, you can write a conditional statement along these lines:

# `image` is a placeholder for whatever flag tells you an image is present
if image:
    print('yes')
else:
    print('no')
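To make the idea a bit more concrete in Python: once the JSON is parsed into a dictionary, the check can be a lookup of the media list under the entities field. This is only a minimal sketch. The payload is a hypothetical, heavily trimmed example, and has_media is an illustrative helper name, not part of any library.

```python
import json

# Hypothetical, heavily trimmed tweet payload (real payloads have many more fields)
raw = '{"id": 1, "text": "hello https://t.co/abc", "entities": {"hashtags": [], "media": [{"type": "photo"}]}}'
tweet_json = json.loads(raw)


def has_media(tweet_dict):
    """Return True if the parsed tweet has a non-empty entities['media'] list."""
    return bool(tweet_dict.get("entities", {}).get("media"))


print("yes" if has_media(tweet_json) else "no")
```

Using .get with a default avoids a KeyError for tweets that have no media key at all.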
    I want to check if tweet.text contains an image, and if so increase the count of either regular attachments or social attachments –  Aug 10 '18 at 09:22

If you look at this thread, you will see that all media in a tweet are actually stored in tweet.entities['media'].

Therefore if you want to know if a given tweet (in the format tweepy.models.Status used by tweepy) contains a picture, you could try this:

try:
    # tweet.entities only has a 'media' key when the tweet carries media
    print(any(medium['type'] == 'photo' for medium in tweet.entities['media']))
except KeyError:
    print("No picture in this tweet")
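The same check could be wrapped in a small helper that counts media by type, so photos and videos can be tallied separately. The sketch below rests on a few assumptions: as far as I know, in the v1.1 API videos and GIFs are only reported with their real type under extended_entities, so the helper prefers that attribute when it exists and falls back to entities otherwise. media_types is a hypothetical name, and the SimpleNamespace objects are stand-ins for tweepy Status objects used only to make the example runnable.

```python
from collections import Counter
from types import SimpleNamespace


def media_types(tweet):
    """Return a Counter of media types (e.g. 'photo', 'video') for one tweet."""
    # Prefer extended_entities when present, otherwise fall back to entities
    entities = getattr(tweet, "extended_entities", None) or getattr(tweet, "entities", None) or {}
    return Counter(medium["type"] for medium in entities.get("media", []))


# Stand-in objects for illustration only; real code would pass tweepy Status objects
photo_tweet = SimpleNamespace(entities={"media": [{"type": "photo"}]})
video_tweet = SimpleNamespace(
    entities={"media": [{"type": "photo"}]},
    extended_entities={"media": [{"type": "video"}]},
)
plain_tweet = SimpleNamespace(entities={"hashtags": [], "urls": []})

for t in (photo_tweet, video_tweet, plain_tweet):
    print(dict(media_types(t)))
```

Inside the timeline loop, the returned counts could be added to the social or regular attachment counters depending on which branch the tweet falls into.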

I hope it helps.

ysearka
    Thanks, I had solved it using tweet.entities too, but a little differently:
        tweet_media = str(tweet.entities)
        if "video" in tweet_media: social_attachments_video += 1
        if "photo" in tweet_media: social_attachments_photo += 1
    –  Aug 29 '18 at 16:28
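For the per-user averages asked about in the question, one possible approach is to record the media types of each tweet and divide the totals by the number of tweets. This is a rough sketch under assumptions: average_media_use is a hypothetical helper, the input shape (one list of type strings per tweet) is my own choice, and the handles and data are made up. Working from the type field rather than searching str(tweet.entities) also avoids matching hashtags or URLs that merely contain the word "photo" or "video".

```python
from collections import Counter


def average_media_use(per_tweet_types):
    """Average number of photos and videos per tweet.

    per_tweet_types: one list of media type strings per tweet,
    e.g. [["photo"], [], ["video"]] (an assumed input shape).
    """
    totals = Counter()
    for types in per_tweet_types:
        totals.update(types)
    n = len(per_tweet_types)
    if n == 0:
        # No tweets in range: report zero rather than dividing by zero
        return {"photo": 0.0, "video": 0.0}
    return {kind: totals[kind] / n for kind in ("photo", "video")}


# Made-up data for two hypothetical handles
media_by_user = {
    "user_a": [["photo"], [], ["photo", "photo"], ["video"]],
    "user_b": [[], []],
}
averages = {handle: average_media_use(tweets) for handle, tweets in media_by_user.items()}
print(averages)
```

The resulting dictionary maps each handle to its average photo and video count per tweet, which could then be written to the results file alongside the other per-user statistics.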