I'm trying to analyze political tweets.

When I run this code:

import tweepy
from tweepy import OAuthHandler
import datetime

consumer_key    = '...'
consumer_secret = '...'
access_token    = '...'
access_secret   = '...'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)

username = "VP"
startDate = datetime.datetime(2017, 12, 1, 0, 0, 0)
endDate =   datetime.datetime(2017, 12, 2, 0, 0, 0)

tweets = []
tmpTweets = api.user_timeline(username)
for tweet in tmpTweets:
    if tweet.created_at < endDate and tweet.created_at > startDate:
        tweets.append(tweet)

while (tmpTweets[-1].created_at > startDate):
    print("Last Tweet @", tmpTweets[-1].created_at, "...fetching more")
    tmpTweets = api.user_timeline(username, max_id = tmpTweets[-1].id)
    for tweet in tmpTweets:
        if tweet.created_at < endDate and tweet.created_at > startDate:
            tweets.append(tweet)

for tweet in tweets:
    print(tweet.created_at)

I get this:

Last Tweet @ 2017-12-02 13:52:36 ...fetching more
2017-12-01 21:06:35
2017-12-01 12:29:27
2017-12-01 12:27:36
2017-12-01 00:50:17
2017-12-01 00:47:42
2017-12-01 00:25:32

But this is wrong. VP tweeted 3 times on Dec 1. These timestamps appear to be ahead by 4 hours. How do I fix this for Eastern Time?

svadhisthana

1 Answer

Timezones..

.created_at is in UTC (+0000), while your startDate and endDate are naive datetime objects that you are treating as local time.

One way to approach this is to convert created_at to local time and then compare:

created_date_local = datetime_from_utc_to_local(tweet.created_at)
if endDate > created_date_local > startDate:
    # ...

where datetime_from_utc_to_local is defined as:

import datetime
import time

def datetime_from_utc_to_local(utc_datetime):
    # Offset between local time and UTC, measured at the moment of the call.
    now_timestamp = time.time()
    offset = datetime.datetime.fromtimestamp(now_timestamp) - datetime.datetime.utcfromtimestamp(now_timestamp)
    return utc_datetime + offset

This is just one way to do it, not necessarily the best.

Prints 3 tweets only, as desired:

Last Tweet @ 2017-12-02 13:52:36 ...fetching more
2017-12-01 21:06:35
2017-12-01 12:29:27
2017-12-01 12:27:36
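
The helper above uses the machine's current UTC offset, so it depends on where and when the script runs. A hard-coded offset has a similar problem because Eastern Time switches between UTC-5 and UTC-4 with daylight saving. As a hedged alternative, here is a minimal sketch that pins the zone explicitly with the standard-library zoneinfo module (Python 3.9+, and it assumes time zone data is available on the system). The names EASTERN, to_eastern and in_window are made up for this illustration, and the sample timestamps are copied from the question's output rather than fetched from the API.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Assumption: "Eastern Time" means the America/New_York zone.
EASTERN = ZoneInfo("America/New_York")

def to_eastern(dt):
    """Convert a datetime to Eastern Time, treating naive values as UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(EASTERN)

def in_window(created_at, start, end):
    """True if created_at falls strictly between the aware bounds start and end."""
    return start < to_eastern(created_at) < end

# Window bounds expressed directly in Eastern Time.
start_date = datetime(2017, 12, 1, tzinfo=EASTERN)
end_date = datetime(2017, 12, 2, tzinfo=EASTERN)

# Naive UTC timestamps taken from the question's output.
sample = [
    datetime(2017, 12, 1, 21, 6, 35),
    datetime(2017, 12, 1, 12, 29, 27),
    datetime(2017, 12, 1, 12, 27, 36),
    datetime(2017, 12, 1, 0, 50, 17),
    datetime(2017, 12, 1, 0, 47, 42),
    datetime(2017, 12, 1, 0, 25, 32),
]

kept = [t for t in sample if in_window(t, start_date, end_date)]
for t in kept:
    print(to_eastern(t))
```

In the loop from the question, the same check would replace the comparison, i.e. `if in_window(tweet.created_at, start_date, end_date):`. With these sample values the three tweets just after midnight UTC are dropped, because they fall on Nov 30 in Eastern Time.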
alecxe

I tried this method:

from datetime import timedelta
...
# Convert to Eastern Time
startDate = startDate + timedelta(hours=4)
endDate = endDate + timedelta(hours=4)
...
print(tweet.created_at - timedelta(hours=4))

Seems to be working. Thank you for your help! – svadhisthana Dec 06 '17 at 05:44