
Problem

We are trying to make concurrent asynchronous requests using Guzzle. After going through a few resources, like this and this, we came up with the code shared below. However, it is not working as expected.

It looks like Guzzle is performing these requests synchronously rather than asynchronously.

Expectation

Just for test purposes, we are hitting an internal URL that sleeps for 5 seconds. With a concurrency of 10, we expect all 10 requests to be queued and sent to the server almost simultaneously, to wait there for 5 seconds, and then to finish at nearly the same time, at which point the Guzzle client would pick up 10 new requests from the iterator, and so on.

Code

    $iterator = function() {
        $index = 0;
        while (true) {
            $client = new Client(['timeout' => 20]);
            $url = 'http://localhost/wait/5/' . $index++;
            $request = new Request('GET', $url, []);
            echo "Queuing $url @ " . (new Carbon())->format('Y-m-d H:i:s') . PHP_EOL;
            yield $client
                ->sendAsync($request)
                ->then(function(Response $response) use ($request) {
                    return [$request, $response];
                });
        }
    };

    $promise = \GuzzleHttp\Promise\each_limit(
        $iterator(),
        10, // concurrency
        function($result, $index) {
            /** @var GuzzleHttp\Psr7\Request $request */
            list($request, $response) = $result;
            echo (string) $request->getUri() . ' completed '.PHP_EOL;
        },
        function(RequestException $reason, $index) {
            // left empty for brevity
        }
    );
    $promise->wait();

Actual Results

We find that Guzzle never starts a second request until the first one has finished, and so on.

Queuing http://localhost/wait/5/1 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/2 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/3 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/4 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/5 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/6 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/7 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/8 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/9 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/10 @ 2017-09-01 17:15:28
http://localhost/wait/5/1 completed
Queuing http://localhost/wait/5/11 @ 2017-09-01 17:15:34
http://localhost/wait/5/2 completed
Queuing http://localhost/wait/5/12 @ 2017-09-01 17:15:39
http://localhost/wait/5/3 completed
Queuing http://localhost/wait/5/13 @ 2017-09-01 17:15:45
http://localhost/wait/5/4 completed
Queuing http://localhost/wait/5/14 @ 2017-09-01 17:15:50 

OS / Version information

  • Ubuntu
  • PHP/7.1.3
  • GuzzleHttp/6.2.1
  • curl/7.47.0

The issue could be with \GuzzleHttp\Promise\each_limit, which perhaps does not initiate or resolve the promises fast enough. It may be that we have to trick it into ticking externally.

Scalable

1 Answer


In the example code, you're creating a new GuzzleHttp\Client instance for every request you want to make. This might not seem important; however, during instantiation GuzzleHttp\Client will set a default handler if none is provided. (That handler is then used for every request sent through the Client, unless it is overridden.)

Note: It determines the best handler to use via this function, though it will most likely end up defaulting to curl_multi_exec.

Why does this matter? The underlying handler is what is responsible for tracking and executing multiple requests at the same time. By creating a new handler every time, none of your requests are being grouped up and run together. For some more insight into this, take a gander at the curl_multi_exec docs.
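
For illustration, here is a minimal sketch (assuming Guzzle 6 with ext-curl installed; not code from the answer) that makes the shared handler explicit: one CurlMultiHandler wrapped in a single HandlerStack and reused by one Client, so every request is multiplexed through the same curl_multi handle.

// Minimal sketch, assuming Guzzle 6 and ext-curl: share one curl_multi-based
// handler across all requests instead of letting each new Client build its own.
use GuzzleHttp\Client;
use GuzzleHttp\Handler\CurlMultiHandler;
use GuzzleHttp\HandlerStack;

$curlMulti = new CurlMultiHandler();           // single curl_multi_* event loop
$stack     = HandlerStack::create($curlMulti); // default middleware on top of it
$client    = new Client(['handler' => $stack, 'timeout' => 20]);

// Any request sent through $client now goes through the same handler,
// so concurrent sendAsync() calls can actually run in parallel.

This is essentially what both options below achieve, since HandlerStack::create() with no argument picks the default (curl_multi) handler once and the single Client then reuses it.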

So, you kind of have two ways of dealing with this:

Pass the client through to the iterator:

$client = new GuzzleHttp\Client(['timeout' => 20]);

$iterator = function () use ($client) {
    $index = 0;
    while (true) {
        if ($index === 10) {
            break;
        }

        $url = 'http://localhost/wait/5/' . $index++;
        $request = new Request('GET', $url, []);

        echo "Queuing $url @ " . (new Carbon())->format('Y-m-d H:i:s') . PHP_EOL;

        yield $client
            ->sendAsync($request)
            ->then(function (Response $response) use ($request) {
                return [$request, $response];
            });

    }
};

$promise = \GuzzleHttp\Promise\each_limit(
    $iterator(),
    10, // concurrency
    function ($result, $index) {
        /** @var GuzzleHttp\Psr7\Request $request */
        list($request, $response) = $result;
        echo (string)$request->getUri() . ' completed ' . PHP_EOL;
    }
);
$promise->wait();

or create the handler elsewhere and pass it to the client: (Though I'm not sure why you'd do this, but it's there!)

$handler = \GuzzleHttp\HandlerStack::create();

$iterator = function () use ($handler) {
    $index = 0;
    while (true) {
        if ($index === 10) {
            break;
        }

        $client = new Client(['timeout' => 20, 'handler' => $handler]);
        $url = 'http://localhost/wait/5/' . $index++;
        $request = new Request('GET', $url, []);

        echo "Queuing $url @ " . (new Carbon())->format('Y-m-d H:i:s') . PHP_EOL;

        yield $client
            ->sendAsync($request)
            ->then(function (Response $response) use ($request) {
                return [$request, $response];
            });

    }
};

$promise = \GuzzleHttp\Promise\each_limit(
    $iterator(),
    10, // concurrency
    function ($result, $index) {
        /** @var GuzzleHttp\Psr7\Request $request */
        list($request, $response) = $result;
        echo (string)$request->getUri() . ' completed ' . PHP_EOL;
    }
);
$promise->wait();
Adam Lavin
  • The reason creating a new client is important is that requirements dictate the use of a proxy server and other request-specific parameters. Every request may be a completely different request, including URL, headers and method. If I create a single client instance, I do not know what it will share with other requests. With that logic, probably the second solution you proposed is preferable. I will try both of the solutions proposed above and report back. Too bad I am at home right now and don't have the setup to try the above. – Scalable Sep 03 '17 at 15:00
  • You should be able to set that on your request with the `proxy` option, http://docs.guzzlephp.org/en/stable/request-options.html#proxy (see the sketch after these comments). – Adam Lavin Sep 03 '17 at 15:04
  • *By creating a new handler every time, none of your requests are properly being grouped up* **Why is this a problem?** Why can't they run together in separate groups? Does that mean that if I'm making only one request, it'll always be run synchronously? – Draex_ Apr 03 '19 at 07:26
  • I got here to find out that in order to run async, php-curl is a must. I didn't have it installed and I spent 2 days trying to find the problem. – StR Jul 10 '19 at 16:58
  • Thanks for that answer @AdamLavin. I think I need this too for my case. I just wonder if I can even use it: I have built a proxy server, and whenever a request hits it, it just forwards the request to a REST API (adds credentials) and passes back the response. Now locally my server crashes if I have too many requests at the same time, so I would have to use promises, I assume. But the difference is that whenever I execute my PHP code where I create Clients and requests, they don't know if there are other requests around... so how would you tackle that problem? – Merc Sep 05 '20 at 22:25
  • Maybe this even relates to @Scalable's comment, but I am too inexperienced to really make sense of all that... – Merc Sep 05 '20 at 22:25
  • @AdamLavin providing links to specific files (and even lines!) in the master branch is not a good idea; it is almost certainly guaranteed that they'll become outdated as time goes by. A better option is to use a revision hash instead of the branch name. – whyer Jan 21 '21 at 17:55
  • So this is useless because you now have to thread a `Client` instance through your whole app to make it work – chpio May 30 '22 at 11:54
  • Check PHP 8.1 - Fibers to support async – Jeffrey Nicholson Carré Nov 05 '22 at 22:24
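
Following up on the `proxy` request option mentioned in the comments: a minimal sketch (not from the answer; the proxy address and URL are hypothetical placeholders) of how per-request options can vary while the single shared Client, and therefore its handler, stays the same.

// Minimal sketch, assuming Guzzle 6: one shared client, per-request options.
// The proxy address and URL below are hypothetical placeholders.
use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;

$client  = new Client(['timeout' => 20]);  // one client, one shared handler
$request = new Request('GET', 'http://localhost/wait/5/1');

// Options passed as the second argument to sendAsync() apply to this request only.
$promise = $client->sendAsync($request, [
    'proxy' => 'http://proxy.example.local:8080',
]);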