6

I'm trying to get all the tweets belonging to a hashtag (and count the total number of tweets). My function is below. How do I use MaxID and SinceID to get all the tweets? What should I use instead of "count"? I don't know.

if (maxid != null)
        {
            var searchResponse =
                await
                (from search in ctx.Search
                 where search.Type == SearchType.Search &&
                 search.Query == "#karne" &&
                 search.Count == Convert.ToInt32(count)
                 select search)
                 .SingleOrDefaultAsync();

            maxid = Convert.ToString(searchResponse.SearchMetaData.MaxID);

            foreach (var tweet in searchResponse.Statuses)
            {
                try
                {
                    ResultSearch.Add(new KeyValuePair<String, String>(tweet.ID.ToString(), tweet.Text));
                    tweetcount++;
                }
                catch {}
            }

            while (maxid != null && tweetcount < Convert.ToInt32(count))
            {
                maxid = Convert.ToString(searchResponse.SearchMetaData.MaxID);
                searchResponse =
                    await
                    (from search in ctx.Search
                     where search.Type == SearchType.Search &&
                     search.Query == "#karne" &&
                     search.Count == Convert.ToInt32(count) && 
                     search.MaxID == Convert.ToUInt64(maxid)
                     select search)
                     .SingleOrDefaultAsync();
                foreach (var tweet in searchResponse.Statuses)
                {
                    try
                    {
                        ResultSearch.Add(new KeyValuePair<String, String>(tweet.ID.ToString(), tweet.Text));
                        tweetcount++;
                    }
                    catch { }
                }
            }

        }
Batuhan Tozun

3 Answers

10

Here's an example. Remember that MaxID applies to the current session: it keeps you from re-reading tweets you've already processed in this session. SinceID is the newest tweet ID you received for this search term in previous sessions, and it keeps you from re-reading tweets that you've already processed in those sessions. Essentially, you're creating a window where MaxID is the newest tweet to get next and SinceID is the oldest tweet that you don't want to read past. On the first session for a given search term, set SinceID to 1 because you don't have any previously processed tweet yet. After the session, save the highest tweet ID you received as the new SinceID so that you don't accidentally re-read tweets.

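    // needs: using System; using System.Collections.Generic; using System.Diagnostics;
    //        using System.Linq; using System.Threading.Tasks; using LinqToTwitter;
    static async Task DoPagedSearchAsync(TwitterContext twitterCtx)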
    {
        const int MaxSearchEntriesToReturn = 100;

        string searchTerm = "twitter";

        // newest id already processed for this search term in earlier sessions (1 on the first run)
        ulong sinceID = 1;

        // used after the first query to track current session
        ulong maxID; 

        var combinedSearchResults = new List<Status>();

        List<Status> searchResponse =
            await
            (from search in twitterCtx.Search
             where search.Type == SearchType.Search &&
                   search.Query == searchTerm &&
                   search.Count == MaxSearchEntriesToReturn &&
                   search.SinceID == sinceID
             select search.Statuses)
            .SingleOrDefaultAsync();

        combinedSearchResults.AddRange(searchResponse);
        ulong previousMaxID = ulong.MaxValue;

        // keep paging until a page comes back empty; checking first also
        // avoids calling Min on an empty first page, which would throw
        while (searchResponse.Any())
        {
            // one less than the oldest (lowest) id you've just received
            maxID = searchResponse.Min(status => status.StatusID) - 1;

            Debug.Assert(maxID < previousMaxID);
            previousMaxID = maxID;

            searchResponse =
                await
                (from search in twitterCtx.Search
                 where search.Type == SearchType.Search &&
                       search.Query == searchTerm &&
                       search.Count == MaxSearchEntriesToReturn &&
                       search.MaxID == maxID &&
                       search.SinceID == sinceID
                 select search.Statuses)
                .SingleOrDefaultAsync();

            combinedSearchResults.AddRange(searchResponse);
        }

        combinedSearchResults.ForEach(tweet =>
            Console.WriteLine(
                "\n  User: {0} ({1})\n  Tweet: {2}",
                tweet.User.ScreenNameResponse,
                tweet.User.UserIDResponse,
                tweet.Text));
    }

This approach seems like a lot of code, but it gives you more control over the search. For example, you can examine tweets and decide how many more times to query based on the contents of a tweet (such as CreatedAt). You can wrap the query in a try/catch block to watch for HTTP 429, returned when you've exceeded your rate limit or Twitter has a problem, so that you can remember where you were and resume later. You could also monitor the TwitterContext RateLimit properties to see whether you're getting close to the limit and avoid the HTTP 429 exception ahead of time. Any other technique that blindly reads N tweets could force you to waste rate limit and make your application less scalable.
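The stop conditions mentioned above (a cap on the number of tweets and a CreatedAt cutoff) can be sketched without any Twitter library. This is only a rough illustration: `Tweet`, `BoundedPaging`, and the `fetchPage` delegate are hypothetical names that stand in for `Status` and the LINQ to Twitter query, and the delegate is assumed to return tweets newest first with `sinceId < Id <= maxId`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for LinqToTwitter's Status (record syntax needs C# 9+).
public record Tweet(ulong Id, DateTime CreatedAt, string Text);

public static class BoundedPaging
{
    // fetchPage(maxId, sinceId) is an assumed delegate wrapping the real query.
    public static List<Tweet> Collect(
        Func<ulong, ulong, IReadOnlyList<Tweet>> fetchPage,
        ulong sinceId,
        int maxTweets,
        DateTime oldestAllowed)
    {
        var results = new List<Tweet>();
        ulong maxId = ulong.MaxValue;

        while (results.Count < maxTweets)
        {
            IReadOnlyList<Tweet> page = fetchPage(maxId, sinceId);
            if (page.Count == 0) break;      // nothing older is available

            foreach (Tweet tweet in page)
            {
                // stop at the date cutoff or the tweet cap, whichever comes first
                if (tweet.CreatedAt < oldestAllowed || results.Count >= maxTweets)
                    return results;
                results.Add(tweet);
            }

            ulong lowest = page.Min(t => t.Id);
            if (lowest == 0) break;          // guard against ulong underflow
            maxId = lowest - 1;              // next page starts just below this one
        }

        return results;
    }
}
```

In real code the delegate would run the paged query from the answer and could catch the rate-limit exception there; the loop itself only decides when to stop.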

  • Tip: Remember to save SinceID for the given search term, if you're saving tweets, to keep from re-reading the same tweets the next time you do a search with that search term.
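One way to follow the tip is to keep a small per-term store. The sketch below is a minimal illustration, not part of LINQ to Twitter: `SinceIdStore` and its tab-separated file format are made up for this example, and it assumes (as the Working with Timelines guidance describes) that the value to save is the highest tweet ID received so far.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper; the file format (term<TAB>id per line) is an assumption,
// so search terms must not contain tab characters.
public sealed class SinceIdStore
{
    private readonly string _path;
    private readonly Dictionary<string, ulong> _ids = new Dictionary<string, ulong>();

    public SinceIdStore(string path)
    {
        _path = path;
        if (!File.Exists(path)) return;

        foreach (string line in File.ReadAllLines(path))
        {
            string[] parts = line.Split('\t');
            if (parts.Length == 2 && ulong.TryParse(parts[1], out ulong id))
                _ids[parts[0]] = id;
        }
    }

    // 1 means "no previous session", matching the first-run value in the answer.
    public ulong Get(string searchTerm) =>
        _ids.TryGetValue(searchTerm, out ulong id) ? id : 1;

    // Record the highest id received this session; never move the marker backwards.
    public void Update(string searchTerm, IEnumerable<ulong> receivedIds)
    {
        ulong newest = receivedIds.DefaultIfEmpty(0UL).Max();
        if (newest > Get(searchTerm))
        {
            _ids[searchTerm] = newest;
            File.WriteAllLines(_path, _ids.Select(kv => kv.Key + "\t" + kv.Value));
        }
    }
}
```

Before a search you would pass `store.Get(searchTerm)` as SinceID, and after it call `store.Update(searchTerm, ...)` with the IDs of the tweets you collected.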

For more info on the mechanics of this, read Working with Timelines in the Twitter docs.

Joe Mayo
  • This code does not really work. It goes into an endless loop that EATS memory on the device. Reached over 1.3 GB of RAM and then crashes. Was using internet all along as well. What am I doing wrong? I used the EXACT same code – Everyone Feb 24 '17 at 22:38
  • @Everyone The search term "twitter" is bringing back a lot of tweets. So, you can change it to something like "LINQ to Twitter", which doesn't get as much traffic. You can also check the Created At date to make sure you go only so far back in time. Another option is to set a number of tweets to stop at. Also, notice that SinceID is set to 1, meaning the search will continue until either Twitter stops producing results or you reach the SinceID. Saving your most recent SinceID for subsequent calls helps avoid requesting duplicate tweets. Read the Working with Timelines link for more information. – Joe Mayo Feb 26 '17 at 03:07
  • Yeah I realized that. It doesn't stop until the condition tells it to do so. Interesting method. +1 :) – Everyone Feb 26 '17 at 06:38
0

Just wanted to say that with Tweetinvi it would be as simple as:

// If you want to handle RateLimits
RateLimit.RateLimitTrackerOption = RateLimitTrackerOptions.TrackAndAwait;

var tweets = Search.SearchTweets(new TweetSearchParameters("#karne")
{
    MaximumNumberOfResults = 10000,
    MaxId = 243982 // If you want to start at a specific point
});
Linvi
  • Does it really get all of the tweets? – Batuhan Tozun Jan 26 '16 at 16:28
  • This might be okay in simple scenarios. However, it could be wasteful in that you can easily read duplicate tweets on subsequent searches and might exceed rate limits on high number of tweets, causing exceptions that reduce performance and scalability. – Joe Mayo Jan 26 '16 at 17:47
  • "Does it really get all of the tweets?" Yes, it does, as of the point in time when you run it. But since Joe gave you a solution, I'm happy that solved it. – Linvi Jan 27 '16 at 15:59
  • I have added the example for RateLimits and MaxId for users who would be interested! – Linvi Jan 27 '16 at 16:03
0

TweetInvi is even simpler now. All you need to do is:

var matchingTweets = Search.SearchTweets("#AutismAwareness");