Twitter scraping with snscrape

Asked Jun 06 '22 at 14:21

Active Jun 06 '22 at 14:22

Viewed 514 times

I want to extract a fixed number of tweets per day on a given topic over a given period (in this example, 100 tweets about bitcoin each day for 31 days). The code works, but all the extracted tweets were published in the last 2-3 minutes of each day, so I am basically getting tweets from around 23:57 every day. I need the tweets to be a good sample of each day, not just of its last 3 minutes, so I would like them to be drawn randomly across the day, or something like 10 per hour. How can I solve this?

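import pandas as pd
import snscrape.modules.twitter as sntwitter

day1 = "2022-05-01"
x = 2
tweets_list2 = []
for i in range(30):
    # Zero-pad single-digit days so the date reads YYYY-MM-DD
    if i < 8:
        var = "0" + str(x)
    else:
        var = x
    since_date = "2022-05-" + str(var)
    print(day1)
    print(since_date)

    # Search one day at a time, from day1 up to since_date
    query = 'bitcoin since:{} until:{}'.format(day1, since_date)
    for j, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
        if j >= 100:  # keep at most 100 tweets per day
            break
        tweets_list2.append([tweet.date, tweet.content])
    x += 1
    day1 = since_date

tweets_df2 = pd.DataFrame(tweets_list2, columns=['Datetime', 'Text'])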
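
One possible way to get the "a few tweets per hour" spread described above is to split each day into one-hour windows and query each window separately. The sketch below only builds the query strings and does not call snscrape. It rests on an assumption not confirmed anywhere in this post: that the search accepts `since_time:` and `until_time:` operators taking Unix timestamps in seconds. `hourly_queries` is a hypothetical helper name, and days are treated as UTC.

```python
from datetime import datetime, timedelta, timezone


def hourly_queries(term, day, hours=24):
    """Build one search query per hour of `day` ('YYYY-MM-DD', taken as UTC).

    ASSUMPTION: the search backend understands since_time:/until_time:
    with Unix timestamps in seconds. Verify this before relying on it.
    """
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    queries = []
    for h in range(hours):
        lo = start + timedelta(hours=h)   # window start (inclusive)
        hi = lo + timedelta(hours=1)      # window end
        queries.append("{} since_time:{} until_time:{}".format(
            term, int(lo.timestamp()), int(hi.timestamp())))
    return queries


queries = hourly_queries("bitcoin", "2022-05-01")
print(len(queries))
print(queries[0])
```

Each query string could then be passed to `sntwitter.TwitterSearchScraper(...)` in the same way as the daily query, breaking out of the loop after a handful of tweets per hour instead of 100 per day.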

edited Jun 06 '22 at 14:22

asked Jun 06 '22 at 14:21

Jacopo Giannetti

For better handling of the date strings, try this: https://stackoverflow.com/q/2265357/13040423 . – AsukaMinato Jun 06 '22 at 20:54
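
As an illustration of the comment's point, the manual zero-padding in the question could be replaced with the standard `datetime` module. This is a minimal sketch, not taken from the linked answer; `day_windows` is a hypothetical helper name that returns consecutive (since, until) date pairs as strings.

```python
from datetime import date, timedelta


def day_windows(first_day, n_days):
    """Return n_days consecutive (since, until) pairs as 'YYYY-MM-DD' strings."""
    start = date.fromisoformat(first_day)
    windows = []
    for k in range(n_days):
        since = start + timedelta(days=k)
        until = since + timedelta(days=1)  # next calendar day, month rollover handled
        windows.append((since.isoformat(), until.isoformat()))
    return windows


windows = day_windows("2022-05-01", 31)
print(windows[0])
print(windows[-1])
```

Each pair can be dropped into the `since:{} until:{}` query from the question, with no special case for single-digit days or for the end of the month.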

0 Answers