I'm a bit confused about the search API. Let's suppose I query for "foobar" with the following code:

from twython import Twython
api = Twython(...)
r = api.search(q="foobar")

This gives me 15 statuses and a "next_results" entry in r["search_metadata"]. Is there any way to feed that metadata back to the Twython API to fetch the following statuses as well, or shall I extract the next max_id by hand from "next_results" and perform a brand-new query?

petrux
  • Which version of Twython are you using? The search method changed in Twython 3.1.0. – Ben Oct 12 '13 at 02:53
  • I just did `pip install twython` so I assume I'm using the 3.0.0. I found [this](https://github.com/gawbul/tweet_aggregator/blob/master/tweet_aggregator.py) workaround anyway. – petrux Oct 13 '13 at 09:59
  • Cool. BTW, you can see the version in the Python shell with `import twython; print(twython.__version__)` – Ben Oct 13 '13 at 10:05
  • It says I'm using version 3.1.0. At this point, I'm a bit confused... – petrux Oct 14 '13 at 10:32
  • Okay, there's a bit in the Twython docs on the updated search method suggesting the cursor function, but I haven't played with it yet, so I have no sample code. The search API does allow up to 100 results in a single query, though; 15 is just the default. – Ben Oct 14 '13 at 13:49

1 Answer

petrux, "next_results" is returned in "search_metadata" along with "max_id" and "since_id", which can be used to loop back through the timeline until you get the desired number of tweets.

Here is Twitter's documentation on how to do it: https://dev.twitter.com/docs/working-with-timelines
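The heart of that approach is pulling the next max_id out of the "next_results" query string. String splitting works, but parsing it with `urllib.parse` is more robust; `extract_max_id` is a hypothetical helper name and the max_id value below is made up for illustration:

```python
from urllib.parse import parse_qs, urlparse

def extract_max_id(next_results):
    """Pull max_id out of a next_results query string such as
    '?max_id=249279667666817023&q=foobar&count=100&include_entities=1'."""
    params = parse_qs(urlparse(next_results).query)
    return params["max_id"][0]

print(extract_max_id("?max_id=249279667666817023&q=foobar&count=100"))
# → 249279667666817023
```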

Below is sample code which might help.

tweets = []
MAX_ATTEMPTS = 10
COUNT_OF_TWEETS_TO_BE_FETCHED = 500

for i in range(MAX_ATTEMPTS):

    if len(tweets) >= COUNT_OF_TWEETS_TO_BE_FETCHED:
        break # we got 500 tweets... !!

    #----------------------------------------------------------------#
    # STEP 1: Query Twitter
    # STEP 2: Save the returned tweets
    # STEP 3: Get the next max_id
    #----------------------------------------------------------------#

    # STEP 1: Query Twitter
    if i == 0:
        # First call: query Twitter without a max_id.
        results = api.search(q="foobar", count='100')
    else:
        # After the first call we have max_id from the previous result. Pass it in the query.
        # count='100' is needed here too, otherwise later pages fall back to the default of 15.
        results = api.search(q="foobar", include_entities='true', count='100', max_id=next_max_id)

    # STEP 2: Save the returned tweets
    for result in results['statuses']:
        tweets.append(result['text'])

    # STEP 3: Get the next max_id
    try:
        # Parse the returned metadata to get the max_id for the next call.
        next_results_url_params = results['search_metadata']['next_results']
        next_max_id = next_results_url_params.split('max_id=')[1].split('&')[0]
    except KeyError:
        # No more next pages
        break
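The same loop can be packaged as a reusable helper. `fetch_tweets` is a hypothetical name, and the `search` parameter stands in for `api.search` (any callable with the same signature), which lets you exercise the pagination logic without hitting the network:

```python
def fetch_tweets(search, query, wanted=500, page_size=100, max_attempts=10):
    """Follow max_id from next_results across pages until `wanted`
    tweets are collected or there are no more pages."""
    tweets = []
    max_id = None
    for _ in range(max_attempts):
        if len(tweets) >= wanted:
            break
        kwargs = {"q": query, "count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        results = search(**kwargs)
        tweets.extend(status["text"] for status in results["statuses"])
        try:
            next_results = results["search_metadata"]["next_results"]
            max_id = next_results.split("max_id=")[1].split("&")[0]
        except KeyError:
            break  # no more next pages
    return tweets
```

With a real client you would call it as `tweets = fetch_tweets(api.search, "foobar")`.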
kundan
  • did you need to add an `i += 1`? – jxn Feb 10 '15 at 19:44
  • No, the `for i in range(0,MAX_ATTEMPTS):` statement auto-increments the counter... – kundan Feb 11 '15 at 05:28
  • when i tried this i only got 235 tweets and then it breaks. any idea why? – jxn Feb 11 '15 at 07:09
  • Mostly it would happen because there are no more tweets. Check where it is breaking out from. Put a print for `i` in the try section of step 3 and a print for `next_results_url_params` in the except section.... – kundan Feb 12 '15 at 07:51
  • just realized you missed count='100' in the second `api.search()` statement. This causes the call to retrieve only 15 tweets after the first loop. – jxn Nov 02 '15 at 00:39
  • Is there a maximum number of tweets that can be fetched via this method? – Casey Jul 10 '19 at 17:57