0

I am running a Parallel.ForEach loop over a list. Each list item contains an identifier for an API, which I access within the loop.

The API I am accessing has a maximum of 225 requests per minute, so I would like to pause execution of the loop after 220 items and resume once the full minute has passed. I tried Thread.Sleep(numMilliSeconds), but it seems to start up a new thread for each one that goes to sleep, or something of that nature.

This is roughly what I am working with now:

    Parallel.ForEach(list, (currentItem) =>
    {
        while (numRequestsLastMinute > 220 && DateTime.Now.Minute == lastDownloadTime.Minute)
        {
            // milliseconds left until the next clock minute starts
            var timeToPause = (60 - DateTime.Now.Second) * 1000;
            Console.WriteLine("Thread pausing for " + timeToPause / 1000 + " seconds...");
            Thread.Sleep(timeToPause);
            Console.WriteLine("Thread resuming...");
        }

        if (DateTime.Now.Minute > lastDownloadTime.Minute)
        {
            lastDownloadTime = DateTime.Now;
            numRequestsLastMinute = 0;
        }

        //send requests
    });

Clearly, Thread.Sleep is not the right way to go about this, but is there a similar construct I can use within a Parallel.ForEach loop?

  • [Task.Delay](https://stackoverflow.com/a/20084603/884561)? – Kenneth K. Jun 03 '20 at 21:36
  • Perhaps what you need to do is [_cancel_](https://learn.microsoft.com/dotnet/api/system.threading.tasks.paralleloptions.cancellationtoken#System_Threading_Tasks_ParallelOptions_CancellationToken) any pending requests when the minute has elapsed and requeue those not completed for the next time interval. – Lance U. Matthews Jun 03 '20 at 21:39
  • How about running Parallel.ForEach per batch of 200? while... remainingItems > 0 ... get 200 items ... parallel process... next 200 ... You get what I mean. – Algef Almocera Jun 03 '20 at 21:42
  • 60/225 = .267 seconds/request. Said differently, you can get a maximum of 3.75 requests per second. So just go with 3 requests per second. Why not just use a regular For Each loop (possibly inside a thread) and then pause for 300 milliseconds between each request. You won't go over the threshold then. You could also accomplish this with a Timer control that has an Interval of 300 milliseconds. – Idle_Mind Jun 03 '20 at 21:58
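
The pacing arithmetic in the last comment can be sketched as follows. This is only an illustration under stated assumptions: `PacedLoop`, `MinIntervalMs` and `Run` are hypothetical names, and `send` stands in for whatever code issues the API request. Rounding 60000 / 225 up gives 267 ms between requests, so the 300 ms suggested in the comment leaves some extra margin.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper for illustration; not part of the .NET libraries.
public static class PacedLoop
{
    // Smallest whole-millisecond gap that keeps the rate at or below maxPerMinute.
    public static int MinIntervalMs(int maxPerMinute)
    {
        return (int)Math.Ceiling(60000.0 / maxPerMinute);
    }

    // Processes the items one after another, pausing intervalMs after each request.
    // 'send' is an assumed callback that performs the actual API call.
    public static void Run<T>(IEnumerable<T> items, Action<T> send, int intervalMs)
    {
        foreach (T item in items)
        {
            send(item);
            Thread.Sleep(intervalMs);
        }
    }
}
```

Because the loop is sequential, the time spent in each request adds to the gap, so the real rate ends up somewhat below the limit.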

2 Answers

0

I went with a batch solution. Thanks for the tip, @Algef Almocera

    int maxPerMinute = 220;

    while (list.Count > 0)
    {
        // Take the next batch first, then drop it from the remaining work.
        var batch = list.Take(maxPerMinute).ToList();
        list = list.Skip(maxPerMinute).ToList();

        _ = Parallel.ForEach(batch, (currentItem) =>
        {
            //send requests
        });

        // count the finished batch
        numItemsDone += batch.Count;
        Console.WriteLine(numItemsDone + " items downloaded");

        // Still in the same clock minute as the previous batch: wait for the next one.
        if (DateTime.Now.Minute == lastDownloadTime.Minute)
        {
            var timeToPause = (60 - DateTime.Now.Second) * 1000;
            Console.WriteLine(DateTime.Now.ToLongTimeString() + ": Thread pausing for " + timeToPause / 1000 + " seconds...");
            Thread.Sleep(timeToPause);
            Console.WriteLine(DateTime.Now.ToLongTimeString() + ": Thread resuming...");
        }

        lastDownloadTime = DateTime.Now;

    }//end while
0

You want to pause every task once 220 requests per minute have been reached. Any of the tasks could be the one that hits the limit, so each of them has to check for it. When the limit is hit, all tasks should wait until they are released.

So I would keep a queue holding the timestamps of the last (0...220) API calls, plus a lock object instance.

Inside the task, in an endless loop (with cancellation as the exit condition):

  • enter the lock, and inside it:
    • check the next entry to dequeue; if it is older than 1 minute, remove it
    • repeat the previous step until no entry is older than 1 minute
    • if there are still more than 220 entries
      • wait inside this task until the oldest remaining queue entry becomes older than 1 minute, so calculate the waiting time and wait
      • remove that queue entry (now one slot is free for this task)
    • add/enqueue the current timestamp to the queue
  • leave the lock
  • make the API call

--> so all the code that runs under the lock could be placed in one method and called from the task
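
The steps above could look roughly like the sketch below. It is a minimal illustration, not a tested production limiter, and it makes a few assumptions: `SlidingWindowLimiter`, `TryAcquire` and `Wait` are hypothetical names, the limiter blocks as soon as the queue holds `maxCalls` entries, and the waiting happens outside the lock (a small departure from the description) so that other tasks are not held up on the lock itself. It also relies on `DateTime.UtcNow`, which can jump if the system clock is adjusted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper for illustration; not part of the .NET libraries.
public sealed class SlidingWindowLimiter
{
    private readonly int _maxCalls;
    private readonly TimeSpan _window;
    private readonly Queue<DateTime> _timestamps = new Queue<DateTime>();
    private readonly object _gate = new object();

    public SlidingWindowLimiter(int maxCalls, TimeSpan window)
    {
        _maxCalls = maxCalls;
        _window = window;
    }

    // Records the call and returns TimeSpan.Zero if a slot is free at 'now';
    // otherwise returns how long until the oldest call leaves the window.
    public TimeSpan TryAcquire(DateTime now)
    {
        lock (_gate)
        {
            // Drop timestamps that have aged out of the window.
            while (_timestamps.Count > 0 && now - _timestamps.Peek() >= _window)
                _timestamps.Dequeue();

            if (_timestamps.Count >= _maxCalls)
                return _timestamps.Peek() + _window - now;

            _timestamps.Enqueue(now);
            return TimeSpan.Zero;
        }
    }

    // Blocks the calling thread until a slot is free, then records the call.
    public void Wait(CancellationToken token = default(CancellationToken))
    {
        while (true)
        {
            token.ThrowIfCancellationRequested();
            TimeSpan delay = TryAcquire(DateTime.UtcNow);
            if (delay == TimeSpan.Zero) return;
            // Wait outside the lock; wakes early if cancellation is requested.
            token.WaitHandle.WaitOne(delay);
        }
    }
}
```

With this in place, the body of the `Parallel.ForEach` would create one shared `new SlidingWindowLimiter(220, TimeSpan.FromMinutes(1))` before the loop and call `limiter.Wait()` immediately before each API call. Threads that block in `Wait` still occupy worker threads, so limiting `MaxDegreeOfParallelism` via `ParallelOptions` may be worth considering as well.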

Do I understand you correctly: must you not exceed 225 requests in any 60-second span, or in each absolute minute starting at second 0.000 UTC?

PS: I had a similar problem, but there the limit was tied to a calendar day in the local time zone. For example, Instagram at one time only allowed 100 pictures to be posted per day, counted in the local time zone! So between 22:00 and 02:00 the next morning, 200 pictures could still be posted if no others were posted on either day.

BitLauncher