In my scenario I may need to make over 100 curl requests to get the information I need. There's no way to get this information beforehand, and I don't have access to the server I will be sending the requests to. My plan is to use curl_multi_init(). Each response will come back as JSON. The problem is that I need to receive the responses in the order in which I sent the requests; otherwise I won't know where each response belongs once it comes back. How do I solve this problem?

Drew

2 Answers

Obviously, since the requests are asynchronous, you cannot predict the order in which the responses will arrive. Therefore, in your design, each request must include "some random bit of information" (a so-called nonce) which the responding server is obliged to return to you verbatim.

Based upon this "nonce," you will then be able to pair each response to the request which originated it – and to discard any random bits of garbage that wander in "out of the blue."

Otherwise, there is no(!) solution to your problem.
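
To make the nonce idea concrete, here is a minimal sketch. It assumes, hypothetically, that the remote API copies a `request_id` query parameter into a `request_id` field of its JSON reply; the parameter name, the `req-N` id format, and both helper functions are made up for illustration. If the server does not echo the value back, this approach cannot work.

```php
<?php
// ASSUMPTION: the server echoes the "request_id" query parameter back
// as a "request_id" field in its JSON response. Both names are hypothetical.

/** Append a unique request_id to each URL; returns [id => tagged URL]. */
function tag_urls(array $urls): array
{
    $tagged = [];
    foreach (array_values($urls) as $i => $url) {
        $id  = 'req-' . $i;
        // Use "&" if the URL already has a query string, "?" otherwise.
        $sep = (parse_url($url, PHP_URL_QUERY) === null) ? '?' : '&';
        $tagged[$id] = $url . $sep . 'request_id=' . rawurlencode($id);
    }
    return $tagged;
}

/**
 * Pair JSON bodies (in whatever order they arrived) with the ids we sent.
 * Bodies carrying an unknown id are discarded. The result is returned
 * in the order the requests were sent.
 */
function pair_by_id(array $sentIds, array $jsonBodies): array
{
    $paired = [];
    foreach ($jsonBodies as $body) {
        $data = json_decode($body, true);
        $id   = is_array($data) ? ($data['request_id'] ?? null) : null;
        if (is_string($id) && in_array($id, $sentIds, true)) {
            $paired[$id] = $data;
        }
    }
    $ordered = [];
    foreach ($sentIds as $id) {
        if (array_key_exists($id, $paired)) {
            $ordered[$id] = $paired[$id];
        }
    }
    return $ordered;
}
```

You would request the URLs returned by `tag_urls()` and pass the raw response bodies to `pair_by_id()` once they have all arrived.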

Mike Robinson
  • Can you provide any examples of how to send a nonce? I've been looking around and I can't find out how to send it, and even if I sent it, there's no way to know whether the server would send it back or leave it out of the response. – Drew Nov 07 '18 at 01:36
  • By default, CURL requests in PHP are not executed async. Therefore, he can process the return (save it etc) and then go to the next. – Oliver M Grech Nov 07 '18 at 01:40
  • ..and more detail for you in this post, maybe it helps https://stackoverflow.com/questions/36171222/async-curl-request-in-php – Oliver M Grech Nov 07 '18 at 01:41
  • @OliverMGrech I should have mentioned in my post to begin with that I was using `curl_multi_init()`. I have edited my question to reflect this. – Drew Nov 07 '18 at 02:07
  • I don't think you need a nonce for this? There is no secret to keep. Just use an incrementing variable. – bishop Nov 07 '18 at 02:35
  • @bishop can you explain in more detail? If I send a variable along with the request, how can I be sure it will be returned in the response? – Drew Nov 07 '18 at 02:43
  • I see what you're saying now. You don't need a counter of any kind: just keep the handles for each curl that's part of the multi in an array and compare the responding handles with the stored ones: you can trust curl to give you back the handles you gave it. – bishop Nov 07 '18 at 02:48
  • @bishop thank you for helping. That makes this seem doable now. How can I tie the handles to the request so that I can compare them when the response comes? – Drew Nov 07 '18 at 03:08
  • You basically "do this any way you like – any way you *can."* Somehow, you need to send the nonce-value out as part of your request, and somehow arrange for that nonce to somehow be returned with every reply. You're going to have to arrange that on the *host* side. The purpose of the nonce is to identify that "this reply is associated with the request which has this nonce value." (And sometimes, a reply will wander in bearing a nonce-value that you no longer care about.) – Mike Robinson Nov 07 '18 at 16:00

When you get the handles back from curl_multi_info_read, you can compare those handles against your keyed list, then of course use the key to know where your response goes. Here's the direct implementation, based on a model I use for a scraper:

// here's our list of URLs, in the order we care about
$easy_handles['google']     = curl_init('https://google.com/');
$easy_handles['bing']       = curl_init('https://bing.com/');
$easy_handles['duckduckgo'] = curl_init('https://duckduckgo.com/');

// our responses will be here, keyed same as URL list
$responses = [];

// here's the code to do the multi-request -- it's all boilerplate
$common_options = [ CURLOPT_FOLLOWLOCATION => true, CURLOPT_RETURNTRANSFER => true ];
$multi_handle = curl_multi_init();
foreach ($easy_handles as $easy_handle) {
    curl_setopt_array($easy_handle, $common_options);
    curl_multi_add_handle($multi_handle, $easy_handle);
}
do {
    $status = curl_multi_exec($multi_handle, $runCnt);
    assert(CURLM_OK === $status);
    do {
        $status = curl_multi_select($multi_handle, 2/*seconds timeout*/);
        if (-1 === $status) usleep(10); // reported bug in PHP
    } while (0 === $status);
    while (false !== ($info = curl_multi_info_read($multi_handle))) {
        foreach ($easy_handles as $key => $easy_handle) { // find the response handle
            if ($info['handle'] === $easy_handle) {       // from our list
                if (CURLE_OK === $info['result']) {
                    $responses[$key] = curl_multi_getcontent($info['handle']);
                } else {
                    $responses[$key] = new \RuntimeException(
                        curl_strerror($info['result'])
                    );
                }
            }
        }
    }
} while (0 < $runCnt);

Most of this is boilerplate machinery to do the multi fetch. The lines that target your specific question are:

foreach ($easy_handles as $key => $easy_handle) { // find the response handle
    if ($info['handle'] === $easy_handle) {       // from our list
        if (CURLE_OK === $info['result']) {
            $responses[$key] = curl_multi_getcontent($info['handle']);

Loop over your list comparing the returned handle against each stored handle, then use the corresponding key to fill in your response.
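
As a possible variation, the key can be stored on the handle itself with `CURLOPT_PRIVATE` and read back with `curl_getinfo($handle, CURLINFO_PRIVATE)`, which avoids the inner loop over the list. The sketch below assumes the curl extension is loaded; the three helper function names are hypothetical and not part of the code above. The last helper puts the responses back into the order of the original list, since they are stored in completion order.

```php
<?php
/** Create one easy handle per URL and attach its key to the handle. */
function make_tagged_handles(array $urls): array
{
    $handles = [];
    foreach ($urls as $key => $url) {
        $handle = curl_init($url);
        curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
        // The private string travels with the handle and never goes on the wire.
        curl_setopt($handle, CURLOPT_PRIVATE, (string) $key);
        $handles[$key] = $handle;
    }
    return $handles;
}

/** Read back the key that was attached to a handle. */
function key_of($handle): string
{
    return (string) curl_getinfo($handle, CURLINFO_PRIVATE);
}

/** Return responses in the order of the original URL list. */
function in_request_order(array $urls, array $responses): array
{
    $ordered = [];
    foreach (array_keys($urls) as $key) {
        if (array_key_exists($key, $responses)) {
            $ordered[$key] = $responses[$key];
        }
    }
    return $ordered;
}
```

Inside the `curl_multi_info_read` loop, the lookup would then become `$responses[key_of($info['handle'])] = curl_multi_getcontent($info['handle']);`, followed by one call to `in_request_order()` after the outer loop finishes.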

bishop
  • ... which is also a fine way to do it, as long as you know that all of the CURL requests can be launched at the same time. Each one is passing through a distinct socket-connection. So, this is indeed a great solution. *(Upvoted.)* But problems can arise if there are *too many at once.* – Mike Robinson Nov 07 '18 at 16:03
  • @bishop This is an amazing answer. I selected best answer for this. Thank you very much. – Drew Nov 08 '18 at 08:39
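
On the "too many at once" concern raised in the comments, one simple option is to run the requests in fixed-size batches. This is only a sketch: `batch_urls()` and `fetch_in_batches()` are hypothetical helpers, and `$fetchBatch` stands for any callable that takes a keyed list of URLs and returns responses under the same keys (for example, the multi-request code from the answer wrapped in a function).

```php
<?php
/**
 * Split a keyed URL list into batches of at most $size entries,
 * keeping the original keys so responses can still be matched up.
 */
function batch_urls(array $urls, int $size): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }
    return array_chunk($urls, $size, true); // true = preserve keys
}

/**
 * Run $fetchBatch once per batch and merge the results.
 * $fetchBatch is a hypothetical callable: [key => url] in, [key => response] out.
 */
function fetch_in_batches(array $urls, int $size, callable $fetchBatch): array
{
    $responses = [];
    foreach (batch_urls($urls, $size) as $batch) {
        // The + operator keeps keys; array_merge would renumber integer keys.
        $responses += $fetchBatch($batch);
    }
    return $responses;
}
```

Because batches run one after another and keys are preserved, the merged result comes back in the order of the original list as long as `$fetchBatch` returns each batch in request order.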