Rate limiting yourself from overloading external APIs

Question

I found a lot of information and script examples that show how to rate limit the users of an API, but I wasn't able to find any examples of how to rate limit your own requests to an API when these limits are imposed on you.

I've always rate limited my scripts with calls such as `sleep` or `usleep`, but that feels inefficient, especially when the API endpoints have fairly high rate limits. Hammering APIs until you hit the limits is inefficient too.

For example, Google's API limits vary depending on the API you are using and can increase or decrease. In that case a fixed rate limit hard coded into the script just seems like primitive guesswork!

Have I missed something pretty obvious? Or is this just not as common as I expect it to be?

You could implement a queue: put each message/action in the queue and execute actions until the rate limit is hit. Then interpret the limit message and hold queued messages of the same type in the queue until you are allowed to process messages again. — Norbert, Jul 02 '15 at 23:27
Good question, with no simple answer. Rate limiting in APIs can be done in multiple ways, and differently on different platforms. Some APIs signal it in the response, for example the Twitter API sends an [HTTP 429 “Too Many Requests” response code][1], but otherwise it is generally hard for API clients to detect when limits are exceeded. Upvoted the question, because I'm looking forward to reading other answers that might have smart solutions/comments on this broad topic. [1]: https://dev.twitter.com/rest/public/rate-limiting — Michael Krikorev, Jul 03 '15 at 08:55
Thanks both. I have historically done as you suggest @NorbertvanNobelen, but it just feels a little primitive. Google, for example, has a daily limit and a per-second limit on some of its APIs. It's more the "per X seconds/minutes" limits that would be nice to manage; hitting daily limits is more of a software design issue. I'm presuming an in-memory backend with a simple interface that handles the pause if/when needed? (I guess it could be pretty simple, I just don't want to reinvent the wheel!) — williamvicary, Jul 03 '15 at 16:02
+1. Looking into this too at the moment. The idea was to solve the second/minute limit with a worker queue such as Beanstalkd. Since your post is tagged `laravel`, the [queue](http://laravel.com/docs/5.0/queues) docs might be of interest. Here is another article on [Redis with Celery](https://callhub.io/blog/2014/02/03/distributed-rate-limiting-with-redis-and-celery/), which addresses distributed servers as opposed to a single server. The Spotify API, like Twitter's, returns a 429 response. Cycling endpoints seems dirty; queuing should be good, perhaps pausing the queue only when it hits a limit. — Francesco de Guytenaere, Jul 06 '15 at 20:51
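
As a rough illustration of the 429 approach mentioned in the comments above: some APIs send a `Retry-After` header along with a 429 response, either as a number of seconds or as an HTTP date. The sketch below assumes that header is present and that `$headers` is an array keyed by lower-cased header names. `retryDelaySeconds` and its `$fallback` parameter are made-up names for illustration, not part of any library.

```php
<?php
// Hypothetical helper: decide how long to pause after a response.
// Assumes $headers maps lower-cased header names to their values.
function retryDelaySeconds(int $status, array $headers, int $now, int $fallback = 5): int
{
    if ($status !== 429) {
        return 0; // not rate limited, no pause needed
    }

    $value = trim((string) ($headers['retry-after'] ?? ''));
    if ($value === '') {
        return $fallback; // the API gave no hint, so fall back to a guess
    }

    if (preg_match('/^\d+$/', $value) === 1) {
        return (int) $value; // delta-seconds form, e.g. "Retry-After: 30"
    }

    $when = strtotime($value); // HTTP-date form
    if ($when === false) {
        return $fallback; // unparseable value, fall back to a guess
    }

    return max(0, $when - $now);
}
```

A queue worker could call this after each response and `sleep()` for the returned number of seconds before taking the next job, instead of sleeping a fixed amount after every call.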

score 5 · Answer 1 · answered Jul 07 '15 at 07:03

Well, first things first: you should call any external APIs only when you actually need to. The providers will thank you dearly.

There are two ways I usually "impose" a limit on my own API usage. The first, if possible, is to cache the result for N amount of time, usually a lot less than the hard limit of the API itself. This, however, works only in very specific cases.
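
A minimal sketch of the caching idea, assuming a plain in-memory array is good enough (a real script would more likely use APCu, Redis or files so the cache survives between requests). `TtlCache`, `remember` and `$fetchQuote` are hypothetical names, and the closure only stands in for the real API call.

```php
<?php
// Hypothetical in-memory cache with a time-to-live per entry.
class TtlCache
{
    private $items = array();

    // Return the cached value for $key, or call $fetch and keep its
    // result for $ttl seconds. $now can be passed in to make testing easy.
    public function remember($key, $ttl, callable $fetch, $now = null)
    {
        $now = $now ?? microtime(true);

        if (isset($this->items[$key]) && $this->items[$key]['expires'] > $now) {
            return $this->items[$key]['value']; // still fresh, no API call
        }

        $value = $fetch(); // the only place the external API is hit
        $this->items[$key] = array('value' => $value, 'expires' => $now + $ttl);

        return $value;
    }
}

$calls = 0;
$fetchQuote = function () use (&$calls) {
    $calls++;               // stands in for the real API request
    return "result #$calls";
};

$cache  = new TtlCache();
$first  = $cache->remember('quote', 60, $fetchQuote, 1000.0); // calls the API
$second = $cache->remember('quote', 60, $fetchQuote, 1030.0); // served from cache
$third  = $cache->remember('quote', 60, $fetchQuote, 1061.0); // expired, calls again
```

Three lookups cost two API calls here, because the second lookup falls inside the 60 second lifetime of the first result.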

The second is persistent/semi-persistent counters, where you store a counter in some sort of memory backend along with the time when the limiting period began. Before every API call, check the storage: compare the current time with the start of the interval, and the number of requests you have already made with the limit imposed by the API. If you are still inside the interval and under the limit, you can make the request. If the interval has elapsed, you can reset the counter and start a new interval. If your next request would exceed the limit and you are still in the current interval, you can show a pretty error. On each external request, update the interval start time if the interval has elapsed, and increment the counter.
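
The counter approach described above could look roughly like this. It is only a sketch: the state lives in object properties rather than a shared memory backend, and `WindowCounter` and `tryAcquire` are names invented for the example.

```php
<?php
// Hypothetical fixed-window counter: at most $limit calls per $interval seconds.
class WindowCounter
{
    private $limit;
    private $interval;
    private $windowStart = 0.0; // when the current limiting period began
    private $count = 0;         // requests made in the current period

    public function __construct($limit, $interval)
    {
        $this->limit    = $limit;
        $this->interval = $interval;
    }

    // Returns true and records the request if it fits in the current
    // window, false if the caller should wait or show an error.
    public function tryAcquire($now = null)
    {
        $now = $now ?? microtime(true);

        if ($now - $this->windowStart >= $this->interval) {
            $this->windowStart = $now; // interval elapsed: start a new one
            $this->count = 0;
        }

        if ($this->count >= $this->limit) {
            return false; // still in the interval and already at the limit
        }

        $this->count++;
        return true;
    }
}

$counter = new WindowCounter(2, 60); // 2 requests per 60 seconds
$a = $counter->tryAcquire(1000.0);   // true, opens a window at t=1000
$b = $counter->tryAcquire(1001.0);   // true
$c = $counter->tryAcquire(1002.0);   // false, limit reached
$d = $counter->tryAcquire(1060.0);   // true, 60 seconds have passed
```

To share the limit between requests or processes, `$windowStart` and `$count` would have to be read from and written back to the storage backend around each check.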

score 5 · Accepted Answer · answered Jul 07 '15 at 11:21

Okay, for giggles I've thrown together a limiter class that will allow you to specify the limit per second, minute and hour. I can't resist having a good reason to use a circular queue!

If you have multiple processes doing the consumption, whether simultaneous or not, you'll have to devise a way to store and/or share the usage history on your own.
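
One possible way to share the usage history, sketched under the assumption that all processes run on the same machine and can reach the same file: keep the timestamps in a JSON file and guard it with `flock`. This is not what the class below does, and `tryAcquireShared` is a hypothetical name.

```php
<?php
// Hypothetical helper: allow at most $limit calls per $interval seconds
// across every process that uses the same $file.
function tryAcquireShared(string $file, int $limit, float $interval, ?float $now = null): bool
{
    $now = $now ?? microtime(true);

    $handle = fopen($file, 'c+'); // create if missing, never truncate on open
    if ($handle === false) {
        throw new RuntimeException("Cannot open $file");
    }

    try {
        flock($handle, LOCK_EX); // wait until no other process holds the lock

        $stamps = json_decode((string) stream_get_contents($handle), true);
        if (!is_array($stamps)) {
            $stamps = array(); // empty or unreadable file: start fresh
        }

        // Keep only the calls that are still inside the window.
        $stamps = array_values(array_filter($stamps, function ($t) use ($now, $interval) {
            return $t > $now - $interval;
        }));

        $allowed = count($stamps) < $limit;
        if ($allowed) {
            $stamps[] = $now; // register this call before releasing the lock
        }

        ftruncate($handle, 0);
        rewind($handle);
        fwrite($handle, json_encode($stamps));
        fflush($handle);

        return $allowed;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

Checking and registering happen under the same lock, so two processes cannot both claim the last free slot. For processes spread over several servers, a shared store such as Redis would be needed instead of a local file.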

```
<?php
// LIMITER.PHP
class Limiter
{
  private $queue = array();
  private $size;
  private $next;

  private $perSecond;
  private $perMinute;
  private $perHour;

  // Set any constructor parameter to non-zero to allow adherence to the
  // limit represented. The largest value present will be the size of a
  // circular queue used to track usage.
  // -------------------------------------------------------------------
  function __construct($perSecond=0,$perMinute=0,$perHour=0)
  {
    $this->size = max($perSecond,$perMinute,$perHour);
    $this->next = 0;

    $this->perSecond = $perSecond;
    $this->perMinute = $perMinute;
    $this->perHour   = $perHour;

    for($i=0; $i < $this->size; $i++)
      $this->queue[$i] = 0;
  }

  // See if a use would violate any of the limits specified. We return true
  // if a limit has been hit.
  // ----------------------------------------------------------------------
  public function limitHit($verbose=0)
  {
    $inSecond = 0;
    $inMinute = 0;
    $inHour   = 0;

    $doneSecond = 0;
    $doneMinute = 0;
    $doneHour   = 0;

    // Take the time once so every comparison uses the same instant
    $now = microtime(true);

    if ( $verbose )
      echo "Checking if limitHit at $now<br>\n";

    // Walk backwards from the most recent usage to the oldest
    for ($offset=1; $offset <= $this->size; $offset++)
    {
      $spot = $this->next - $offset;
      if ( $spot < 0 )
        $spot = $this->size - $offset + $this->next;

      if ( $verbose )
        echo "... next $this->next size $this->size offset $offset spot $spot utime " . $this->queue[$spot] . "<br>\n";

      // Count and track within second
      // -----------------------------
      if ( $this->perSecond && !$doneSecond && $this->queue[$spot] >= $now - 1.0 )
        $inSecond++;
      else
        $doneSecond = 1;

      // Count and track within minute
      // -----------------------------
      if ( $this->perMinute && !$doneMinute && $this->queue[$spot] >= $now - 60.0 )
        $inMinute++;
      else
        $doneMinute = 1;

      // Count and track within hour
      // ---------------------------
      if ( $this->perHour && !$doneHour && $this->queue[$spot] >= $now - 3600.0 )
        $inHour++;
      else
        $doneHour = 1;

      if ( $doneSecond && $doneMinute && $doneHour )
        break;
    }

    if ( $verbose )
      echo "... inSecond $inSecond inMinute $inMinute inHour $inHour<br>\n";

    if ( $inSecond && $inSecond >= $this->perSecond )
    {
      if ( $verbose )
        echo "... limit perSecond hit<br>\n";
      return TRUE;
    }
    if ( $inMinute && $inMinute >= $this->perMinute )
    {
      if ( $verbose )
        echo "... limit perMinute hit<br>\n";
      return TRUE;
    }
    if ( $inHour   && $inHour   >= $this->perHour   )
    {
      if ( $verbose )
        echo "... limit perHour hit<br>\n";
      return TRUE;
    }

    return FALSE;
  }

  // When an API is called the using program should voluntarily track usage
  // via the usage function.
  // ----------------------------------------------------------------------
  public function usage()
  {
    $this->queue[$this->next++] = microtime(true);
    if ( $this->next >= $this->size )
      $this->next = 0;
  }
}

// ##############################
// ### Test the limiter class ###
// ##############################

$psec = 2;
$pmin = 4;
$phr  = 0;

echo "Creating limiter with limits of $psec/sec and $pmin/min and $phr/hr<br><br>\n";
$monitorA = new Limiter($psec,$pmin,$phr);

for ($i=0; $i<15; $i++)
{
  if ( !$monitorA->limitHit(1) )
  {
    echo "<br>\n";
    echo "API call A here (utime " . microtime(true) . ")<br>\n";
    echo "Voluntarily registering usage<br>\n";
    $monitorA->usage();
    usleep(250000);
  }
  else
  {
    echo "<br>\n";
    usleep(500000);
  }
}
```

In order to demonstrate it in action I've put in some "verbose mode" statements in the limit checking function. Here is some sample output.

```
Creating limiter with limits of 2/sec and 4/min and 0/hr

Checking if limitHit at 1436267440.9957
... next 0 size 4 offset 1 spot 3 utime 0
... inSecond 0 inMinute 0 inHour 0

API call A here (utime 1436267440.9957)
Voluntarily registering usage
Checking if limitHit at 1436267441.2497
... next 1 size 4 offset 1 spot 0 utime 1436267440.9957
... next 1 size 4 offset 2 spot 3 utime 0
... inSecond 1 inMinute 1 inHour 0

API call A here (utime 1436267441.2497)
Voluntarily registering usage
Checking if limitHit at 1436267441.5007
... next 2 size 4 offset 1 spot 1 utime 1436267441.2497
... next 2 size 4 offset 2 spot 0 utime 1436267440.9957
... next 2 size 4 offset 3 spot 3 utime 0
... inSecond 2 inMinute 2 inHour 0
... limit perSecond hit

Checking if limitHit at 1436267442.0007
... next 2 size 4 offset 1 spot 1 utime 1436267441.2497
... next 2 size 4 offset 2 spot 0 utime 1436267440.9957
... next 2 size 4 offset 3 spot 3 utime 0
... inSecond 1 inMinute 2 inHour 0

API call A here (utime 1436267442.0007)
Voluntarily registering usage
Checking if limitHit at 1436267442.2507
... next 3 size 4 offset 1 spot 2 utime 1436267442.0007
... next 3 size 4 offset 2 spot 1 utime 1436267441.2497
... next 3 size 4 offset 3 spot 0 utime 1436267440.9957
... next 3 size 4 offset 4 spot 3 utime 0
... inSecond 1 inMinute 3 inHour 0

API call A here (utime 1436267442.2507)
Voluntarily registering usage
Checking if limitHit at 1436267442.5007
... next 0 size 4 offset 1 spot 3 utime 1436267442.2507
... next 0 size 4 offset 2 spot 2 utime 1436267442.0007
... next 0 size 4 offset 3 spot 1 utime 1436267441.2497
... next 0 size 4 offset 4 spot 0 utime 1436267440.9957
... inSecond 2 inMinute 4 inHour 0
... limit perSecond hit

Checking if limitHit at 1436267443.0007
... next 0 size 4 offset 1 spot 3 utime 1436267442.2507
... next 0 size 4 offset 2 spot 2 utime 1436267442.0007
... next 0 size 4 offset 3 spot 1 utime 1436267441.2497
... next 0 size 4 offset 4 spot 0 utime 1436267440.9957
... inSecond 2 inMinute 4 inHour 0
... limit perSecond hit

Checking if limitHit at 1436267443.5027
... next 0 size 4 offset 1 spot 3 utime 1436267442.2507
... next 0 size 4 offset 2 spot 2 utime 1436267442.0007
... next 0 size 4 offset 3 spot 1 utime 1436267441.2497
... next 0 size 4 offset 4 spot 0 utime 1436267440.9957
... inSecond 0 inMinute 4 inHour 0
... limit perMinute hit

Checking if limitHit at 1436267444.0027
... next 0 size 4 offset 1 spot 3 utime 1436267442.2507
... next 0 size 4 offset 2 spot 2 utime 1436267442.0007
... next 0 size 4 offset 3 spot 1 utime 1436267441.2497
... next 0 size 4 offset 4 spot 0 utime 1436267440.9957
... inSecond 0 inMinute 4 inHour 0
... limit perMinute hit
```
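
The test loop above simply sleeps for half a second whenever a limit is hit. As a possible refinement, a script could work out how long it actually needs to wait. The sketch below is a separate, simplified take on that idea for a single limit, not part of the class above, and `secondsUntilAllowed` is a hypothetical name.

```php
<?php
// Hypothetical helper: given the timestamps of previous calls, return how
// many seconds to wait before one more call fits within $limit calls per
// $interval seconds. Returns 0.0 if the call is allowed right now.
function secondsUntilAllowed(array $timestamps, int $limit, float $interval, float $now): float
{
    // Keep only the calls that are still inside the window ending at $now.
    $recent = array_values(array_filter($timestamps, function ($t) use ($now, $interval) {
        return $t > $now - $interval;
    }));

    if (count($recent) < $limit) {
        return 0.0; // a slot is free
    }

    sort($recent);

    // Enough of the oldest calls must leave the window to get back under
    // the limit; the last of those to expire decides the wait.
    $blocking = $recent[count($recent) - $limit];

    return max(0.0, $blocking + $interval - $now);
}
```

The result can be passed to `usleep((int) ceil($wait * 1000000))`, so the script pauses only as long as the window requires rather than polling at a fixed rate.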

By the way, I have this on GitHub (with the incredibly open MIT license) in case anyone wants to use the code outright in a project but is in an environment with strict rules on clear licensing. https://github.com/gvroom/snippets — A Smith, Jul 14 '15 at 10:27

score 2 · Answer 3 · answered Aug 02 '19 at 15:48

Wrap your API calls in Jobs and push them to a separate queue:
```
ApiJob::dispatch()->onQueue('api');
```
Use queue rate limiting with Redis or with the mxl/laravel-queue-rate-limit package (I'm the author). See also the SO answer about its usage.
If you use mxl/laravel-queue-rate-limit, run the queue worker after setting it up:
```
$ php artisan queue:work --queue api
```

mixel

Neat and tidy approach! – williamvicary Aug 02 '19 at 20:17

score 0 · Answer 4 · answered Jul 09 '15 at 12:04

I don't think this question can be answered in a few sentences. It takes real thought about the architecture of your application. To rate limit my own repeated API calls, I use caches that store the returned values and my usage of the API. To date I have not found any ready-made code.
