
Assume that I have to make an enormous number of HTTP requests (and get their responses). How do I do that, e.g. using Symfony?

The docs propose:

$responses = [];
for ($i = 0; $i < 379; ++$i) {
    $uri = "https://http2.akamai.com/demo/tile-$i.png";
    $responses[] = $client->request('GET', $uri);
}

foreach ($client->stream($responses) as $response => $chunk) {
    if ($chunk->isFirst()) {
        // headers of $response just arrived
        // $response->getHeaders() is now a non-blocking call
    } elseif ($chunk->isLast()) {
        // the full content of $response just completed
        // $response->getContent() is now a non-blocking call
    } else {
        // $chunk->getContent() will return a piece
        // of the response body that just arrived
    }
}

But what if in my case $i <= 1_000_000_000? Obviously it makes no sense to send them all simultaneously, and holding a billion response objects in memory is impossible anyway.

How do I do that nicely? It seems to me that I should somehow add a new request to the loop after each complete response arrives. But how do I do that?

$responses = (static function () use ($client) {
    for ($i = 0; $i < 1e5; $i++) {
        $uri = "http://localhost/test.php?id=$i";
        yield $client->request('GET', $uri);
    }
})();

foreach ($client->stream($responses) as $response => $chunk) {
    if ($chunk->isLast()) {
        // how to schedule next request?
    }
}
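One way to bound memory with Symfony's HttpClient is to process the URLs in fixed-size batches instead of a true sliding window. This is a rough sketch, not a definitive solution: it assumes the symfony/http-client component and the same `$client` as above, and each batch must fully drain before the next one starts, so it is less efficient than a real pool.

```php
// Sketch: at most $batchSize responses exist in memory at any time.
// Assumes $client is a Symfony HttpClientInterface instance.
$batchSize = 100;
$total = 1_000_000_000;

for ($start = 0; $start < $total; $start += $batchSize) {
    $responses = [];
    $end = min($start + $batchSize, $total);
    for ($i = $start; $i < $end; ++$i) {
        // request() is lazy: nothing blocks until the response is read
        $responses[] = $client->request('GET', "http://localhost/test.php?id=$i");
    }
    foreach ($client->stream($responses) as $response => $chunk) {
        if ($chunk->isLast()) {
            // process $response->getContent() here, then let it go out of scope
        }
    }
}
```

The downside is the "convoy" effect: one slow response stalls the whole batch, which is exactly what a concurrency-limited pool avoids.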

Or maybe I can do that easily with guzzle or curl?

whyer
  • The way that I do this is in batches. This should all be done through a CLI and not in a web page or else you're just asking for problems. Anyway, have your script connect to a data store and request `n` items that haven't been processed yet. As each one is processed, mark it as complete in the store. Have the invoker of script auto-restart the script as needed. Either use [cron or a daemon](https://stackoverflow.com/q/38596/231316) of some sort as the invoker. If you want to get really fancy, you can also throw a queue into the mix – Chris Haas Jan 22 '21 at 21:26
  • @ChrisHaas I've just found a much easier way using Guzzle. I'll post the complete answer in several minutes. In short, just use a Guzzle pool: _"You can use the `GuzzleHttp\Pool` object when you have an indeterminate amount of requests you wish to send."_ https://docs.guzzlephp.org/en/stable/quickstart.html#concurrent-requests – whyer Jan 22 '21 at 22:22

1 Answer


You can use the GuzzleHttp\Pool object when you have an indeterminate amount of requests you wish to send: https://docs.guzzlephp.org/en/stable/quickstart.html#concurrent-requests

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use Psr\Http\Message\ResponseInterface;

$client = new Client();

// Called for each successfully completed response.
$onResponse = function (ResponseInterface $response) {
    echo $response->getBody();
};

// A generator: requests are created lazily, one at a time,
// so the full set never has to exist in memory at once.
$requests = (function () use ($client, $onResponse) {
    for ($i = 0; $i < 1e3; $i++) {
        yield fn() => $client->getAsync("http://localhost/test.php?id=$i")
            ->then($onResponse);
    }
})();

// The pool keeps at most 10 requests in flight at any time,
// pulling the next one from the generator as each completes.
(new Pool($client, $requests, ['concurrency' => 10]))->promise()->wait();
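At this scale some requests will inevitably fail, so it is worth knowing that `Pool` also accepts `fulfilled` and `rejected` callables in its config. A hedged sketch, assuming the same `$client` and a `$requests` generator that yields plain requests (without a `->then()` of its own):

```php
use GuzzleHttp\Pool;

$pool = new Pool($client, $requests, [
    'concurrency' => 10,
    // Called with each successful response and the index of its request.
    'fulfilled' => function ($response, $index) {
        echo $response->getBody();
    },
    // Called when a request fails (connection error, or an HTTP error
    // status if http_errors is enabled); $reason is the rejection reason.
    'rejected' => function ($reason, $index) {
        error_log("request $index failed: " . $reason->getMessage());
        // e.g. record $index somewhere so it can be retried later
    },
]);
$pool->promise()->wait();
```

The `$index` argument makes it straightforward to track which of the billion requests still need a retry.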
whyer