How to retrieve more than 100 results using Twitter4j

Question

I'm using the Twitter4j library to retrieve tweets, but I'm not getting nearly enough for my purposes. Currently, I get the maximum of 100 from a single page. How do I use maxId and sinceId in the code below (written in Processing) to retrieve more than 100 results from the Twitter search API? I'm totally new to Processing (and programming in general), so any direction on this would be awesome! Thanks!

void setup() {

  ConfigurationBuilder cb = new ConfigurationBuilder();
  cb.setOAuthConsumerKey("xxxx");
  cb.setOAuthConsumerSecret("xxxx");
  cb.setOAuthAccessToken("xxxx");
  cb.setOAuthAccessTokenSecret("xxxx");

  Twitter twitter = new TwitterFactory(cb.build()).getInstance();
  Query query = new Query("#peace");
  query.setCount(100);

  try {
    QueryResult result = twitter.search(query);
    ArrayList tweets = (ArrayList) result.getTweets();

    for (int i = 0; i < tweets.size(); i++) {
      Status t = (Status) tweets.get(i);

      GeoLocation loc = t.getGeoLocation();

      if (loc != null) {

        String user = t.getUser().getScreenName();
        String msg = t.getText();

        Double lat = t.getGeoLocation().getLatitude();
        Double lon = t.getGeoLocation().getLongitude();

        println("USER: " + user + " wrote: " + msg + " located at " + lat + ", " + lon);

      }
    }
  }

  catch (TwitterException te) {
    println("Couldn't connect: " + te);
  };
}

void draw() {
}

possible duplicate of [Is it possible to get more than 100 tweets?](http://stackoverflow.com/questions/17887984/is-it-possible-to-get-more-than-100-tweets) — surhidamatya, May 08 '15 at 05:07

score 24 · Accepted Answer · edited May 23 '17 at 10:29

Unfortunately you can't, at least not in a direct way such as doing

query.setCount(101);

As the Javadoc says, a single query returns at most 100 tweets.

To get around this, request the tweets in batches: after each batch, set the query's maximum ID to one less than the lowest ID you have received so far, so that the next batch contains only older tweets. Gather every tweet into an ArrayList (which, by the way, should not be left as a raw type but declared as ArrayList<Status>, an ArrayList that holds Status objects) and then print everything. Here's an implementation:

void setup() {

  ConfigurationBuilder cb = new ConfigurationBuilder();
  cb.setOAuthConsumerKey("xxxx");
  cb.setOAuthConsumerSecret("xxxx");
  cb.setOAuthAccessToken("xxxx");
  cb.setOAuthAccessTokenSecret("xxxx");

  Twitter twitter = new TwitterFactory(cb.build()).getInstance();
  Query query = new Query("#peace");
  int numberOfTweets = 512;
  long lastID = Long.MAX_VALUE;
  ArrayList<Status> tweets = new ArrayList<Status>();
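  while (tweets.size() < numberOfTweets) {
    // A single request returns at most 100 tweets
    query.setCount(Math.min(100, numberOfTweets - tweets.size()));
    try {
      QueryResult result = twitter.search(query);
      if (result.getTweets().isEmpty()) break; // no older tweets left
      tweets.addAll(result.getTweets());
      println("Gathered " + tweets.size() + " tweets");
      // Track the lowest (oldest) ID in this batch
      for (Status t : result.getTweets())
        if (t.getId() < lastID) lastID = t.getId();
    }
    catch (TwitterException te) {
      println("Couldn't connect: " + te);
      break; // stop instead of retrying the same request forever
    }
    // max_id is inclusive, so ask only for tweets older than the oldest one seen
    query.setMaxId(lastID - 1);
  }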

  for (int i = 0; i < tweets.size(); i++) {
    Status t = tweets.get(i);

    GeoLocation loc = t.getGeoLocation();

    String user = t.getUser().getScreenName();
    String msg = t.getText();
    if (loc != null) {
      // Only some tweets carry a location
      double lat = loc.getLatitude();
      double lon = loc.getLongitude();
      println(i + " USER: " + user + " wrote: " + msg + " located at " + lat + ", " + lon);
    }
    else
      println(i + " USER: " + user + " wrote: " + msg);
  }
}

Note: The line

ArrayList<Status> tweets = new ArrayList<Status>();

should properly be:

List<Status> tweets = new ArrayList<Status>();

because declaring the variable by its interface lets you swap in a different implementation later. If you are on Processing 2.x, this requires the following import at the top of the sketch:

import java.util.List;
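
The batching loop above can also be tried offline. What follows is only a minimal sketch in plain Java, not Twitter4j code: fakeSearch is a hypothetical stand-in for twitter.search that serves IDs from an array, and the class and method names are made up for illustration. It shows the same two ideas, namely capping each request at the page size and moving the maximum ID to one below the lowest ID seen.

```java
import java.util.ArrayList;
import java.util.List;

class MaxIdPagingSketch {

    // Hypothetical stand-in for twitter.search(): returns up to `count` IDs
    // that are <= maxId, taken in order from an array sorted newest first.
    static List<Long> fakeSearch(long[] idsNewestFirst, long maxId, int count) {
        List<Long> page = new ArrayList<Long>();
        for (long id : idsNewestFirst) {
            if (id <= maxId && page.size() < count) {
                page.add(id);
            }
        }
        return page;
    }

    // Collects up to `wanted` IDs, asking for at most `pageSize` per request.
    static List<Long> collect(long[] idsNewestFirst, int wanted, int pageSize) {
        List<Long> collected = new ArrayList<Long>();
        long maxId = Long.MAX_VALUE;
        while (collected.size() < wanted) {
            int count = Math.min(pageSize, wanted - collected.size());
            List<Long> page = fakeSearch(idsNewestFirst, maxId, count);
            if (page.isEmpty()) {
                break; // nothing older is available, so stop paging
            }
            collected.addAll(page);
            long lowest = Long.MAX_VALUE;
            for (long id : page) {
                lowest = Math.min(lowest, id);
            }
            // The bound is inclusive, so step just below the oldest ID seen
            maxId = lowest - 1;
        }
        return collected;
    }

    public static void main(String[] args) {
        long[] ids = new long[250];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = 1000 - i; // 1000, 999, ..., 751
        }
        // Only 250 IDs exist, so asking for 512 stops early
        System.out.println(collect(ids, 512, 100).size()); // prints 250
    }
}
```

The early break matters in practice: if a search has fewer matches than you asked for, a loop without it would keep requesting empty pages.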

Awesome! The only issue I seem to be having now is that all results greater than 100 are just a re-iteration of the first 100 tweets.. — hapless_cap, Sep 15 '13 at 23:11
I seem to have forgotten a line when splitting up the code to gather the tweets into the list. I edited accordingly! It should be working now! — Petros Koutsolampros, Sep 16 '13 at 17:47

score 2 · Answer 2 · answered Nov 01 '14 at 20:32

Here's the function I made for my app based on the past answers. Thank you everybody for your solutions.

List<Status> tweets = new ArrayList<Status>();

// Assumes a configured Twitter instance named "twitter" is declared elsewhere in the sketch
void getTweets(String term)
{
  int wantedTweets = 112;
  int remainingTweets = wantedTweets - tweets.size();
  Query query = new Query(term);
  try
  {
    while (remainingTweets > 0)
    {
      // A single request returns at most 100 tweets
      query.count(Math.min(100, remainingTweets));
      QueryResult result = twitter.search(query);
      if (result.getTweets().isEmpty())
      {
        break; // no older tweets left
      }
      tweets.addAll(result.getTweets());
      // Assumes results are ordered newest first, so the last tweet has the lowest ID
      Status s = tweets.get(tweets.size() - 1);
      // max_id is inclusive: subtract one so this tweet is not fetched again
      query.setMaxId(s.getId() - 1);
      remainingTweets = wantedTweets - tweets.size();
    }

    println("tweets.size() " + tweets.size());
  }
  catch (TwitterException te)
  {
    System.out.println("Failed to search tweets: " + te.getMessage());
    System.exit(-1);
  }
}

score 1 · Answer 3 · answered Aug 27 '15 at 07:58

From the Twitter search API documentation: "At this time, users represented by access tokens can make 180 requests/queries per 15 minutes. Using application-only auth, an application can make 450 queries/requests per 15 minutes on its own behalf without a user context." So you can wait for 15 minutes and then collect another batch of 400 tweets, something like:

if (tweets.size() % 400 == 0) {
    try {
        // Pause for 15 minutes (900,000 ms) until the rate-limit window resets
        Thread.sleep(900000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
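
To get a feel for the numbers quoted above, here is a rough budget calculation in plain Java. It is only a sketch: the class and method names are made up, and it assumes every request returns a full page of 100 tweets and that nothing else is using the same quota.

```java
class RateLimitBudgetSketch {

    static final int MAX_TWEETS_PER_REQUEST = 100;
    static final long WINDOW_MILLIS = 15L * 60L * 1000L; // 15 minutes = 900000 ms

    // Number of search requests needed, assuming every page comes back full.
    static int requestsNeeded(int wantedTweets) {
        return (wantedTweets + MAX_TWEETS_PER_REQUEST - 1) / MAX_TWEETS_PER_REQUEST;
    }

    // Total time spent waiting for new windows, given a per-window request limit.
    static long totalPauseMillis(int wantedTweets, int requestsPerWindow) {
        int requests = requestsNeeded(wantedTweets);
        if (requests == 0) {
            return 0L;
        }
        // One pause each time the window's budget is used up before the last request
        int pauses = (requests - 1) / requestsPerWindow;
        return pauses * WINDOW_MILLIS;
    }

    public static void main(String[] args) {
        System.out.println(requestsNeeded(512));          // prints 6
        System.out.println(totalPauseMillis(512, 180));   // prints 0
        System.out.println(totalPauseMillis(20000, 180)); // prints 900000
    }
}
```

Under those assumptions, 180 requests of 100 tweets each cover up to 18,000 tweets per window, so a few hundred tweets should not need any waiting at all.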

score 0 · Answer 4 · answered Sep 16 '13 at 12:32

Just keep track of the lowest Status id and use that to set the max_id for subsequent search calls. This will allow you to step back through the results 100 at a time until you've got enough, e.g.:

boolean finished = false;
while (!finished) {
    final QueryResult result = twitter.search(query);    

    final List<Status> statuses = result.getTweets();
    if (statuses.isEmpty()) {
        break; // no older tweets left, so stop paging
    }
    long lowestStatusId = Long.MAX_VALUE;
    for (Status status : statuses) {
        // do your processing here and work out if you are 'finished' etc... 

        // Capture the lowest (earliest) Status id
        lowestStatusId = Math.min(status.getId(), lowestStatusId);
    }

    // Subtracting one here because 'max_id' is inclusive
    query.setMaxId(lowestStatusId - 1);
}

See Twitter's guide on Working with Timelines for more information.
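
One possible way to decide when you are 'finished' is to count how many tweets in each page are actually new. The following is a small sketch in plain Java that works on bare IDs rather than Status objects. The mergePage helper is hypothetical, not part of Twitter4j. It also shows what happens if you forget to subtract one: the boundary ID comes back again and is simply ignored.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PageMergeSketch {

    // Adds the IDs of one page to `seen` and returns how many were new.
    static int mergePage(Set<Long> seen, List<Long> pageIds) {
        int added = 0;
        for (Long id : pageIds) {
            if (seen.add(id)) { // add() returns false for an ID already present
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Set<Long> seen = new LinkedHashSet<Long>(); // keeps insertion order
        System.out.println(mergePage(seen, Arrays.asList(30L, 20L, 10L))); // prints 3
        // Second page overlaps on ID 10, as it would with an inclusive bound
        System.out.println(mergePage(seen, Arrays.asList(10L, 9L, 8L)));   // prints 2
        // A page with nothing new is a reasonable signal to stop
        System.out.println(mergePage(seen, Arrays.asList(10L, 9L, 8L)));   // prints 0
    }
}
```

In a real sketch you would call something like this once per page and set finished to true when it returns 0.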
