11

I have been following this tutorial on how to use curl_multi. http://arguments.callee.info/2010/02/21/multiple-curl-requests-with-php/

I can't tell what I am doing wrong, but curl_multi_getcontent is returning null. It is supposed to return JSON. I know the MySQL call is not the problem, because I had this working with a while loop and standard curl_exec, but the page was taking too long to load. (I've changed some of the setopt details for security.)

Relevant PHP code snippet. I do close the while loop at the end.

$i = 0;
$ch = array();
$mh = curl_multi_init();
while($row = $result->fetch_object()){
   $ch[$i] = curl_init();
   curl_setopt($ch[$i], CURLOPT_CAINFO, 'cacert.pem');
   curl_setopt($ch[$i], CURLOPT_USERPWD, "$username:$password");
   curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, true); 
   curl_setopt($ch[$i], CURLOPT_URL, 'https://mysite.com/search/'.$row->username.'/');
   curl_multi_add_handle($mh, $ch[$i]);
   $i++;
}
$running = 0;
do {
    curl_multi_exec($mh, $running);
} while ($running > 0);
$result->data_seek(0);
$i = 0;
while ($row = $result->fetch_object()) {
    $data = curl_multi_getcontent($ch[$i]);
    $json_data = json_decode($data);
    var_dump($json_data);

EDIT

Here is the code that currently works, but causes the page to load too slowly

$ch = curl_init();
curl_setopt($ch, CURLOPT_CAINFO, 'cacert.pem');
curl_setopt($ch, CURLOPT_USERPWD, "$username:$password");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); 
while($row = $result->fetch_object()){
   curl_setopt($ch, CURLOPT_URL, 'https://mysite.com/search/'.$row->username.'/');
   $data = curl_exec($ch);
   $json_data = json_decode($data);
   var_dump($json_data);
}
Blake Plumb
  • CURLOPT_USERPWD assumes you have HTTP password authentication on the website. If it is set up that way, change your code to curl_setopt($ch[$i], CURLOPT_USERPWD, "$username:$password"); with double quotes. – velcrow Sep 16 '13 at 15:41
  • Have you var_dumped `$ch[$i]` to see if it contains what it should? – NDM Sep 16 '13 at 15:44
  • @velcrow just noticed that error myself, right before your comment. I thought it would fix it, but it didn't. :( – Blake Plumb Sep 16 '13 at 15:44
  • @NDM when I var_dump `$ch[$i]` I get `resource(#) of type (curl)` – Blake Plumb Sep 16 '13 at 16:15
  • I think that you increment $i in the second while, right? If you do, try to use `var_dump(curl_errno($ch[$i]));` inside the second while to see what errors you get. – dcro Sep 16 '13 at 16:22
  • And also try `var_dump(curl_error($ch[$i]));` in conjunction with `curl_errno` – dcro Sep 16 '13 at 16:31
  • @dcro the `curl_errno` returned `int(0)` while `curl_error` returned `string(60) "Unknown SSL protocol error in connection to mysite.com:443 "` I am defining the cacert.pem file so I don't understand why I am getting this error. – Blake Plumb Sep 16 '13 at 16:38
  • @dcro Setting `curl_setopt($ch[$i], CURLOPT_SSL_VERIFYPEER, false);` fixes the issue and my page loads fine. Can `curl_multi_exec` not accept the `curlopt_cainfo` option? – Blake Plumb Sep 16 '13 at 16:54
  • Ok, this is very weird. There have been bugs in the past with the curl lib, SSL and `curl_multi_init`, but those were generally memory leaks. I don't think the `CURLOPT_CAINFO` option is the problem, but rather the inability of the curl lib to negotiate an SSL protocol. It might help to assign the ciphers using `CURLOPT_SSL_CIPHER_LIST`, but it's no guarantee. Also, double check the URL sent in `CURLOPT_URL`, as a simple typo there could result in this error if your DNS provider redirects failed DNS requests to another host. – dcro Sep 16 '13 at 16:59
  • It's possible CURLOPT_SSL_VERIFYPEER triggers additional network calls to check for certificate and intermediate certificate revocation via OCSP or CRL. The certificate pem file on https://mysite:443 would contain the OCSP and/or CRL URLs. Hopefully curl caches the CRL or OCSP response, so subsequent calls don't keep refetching the CRL, but you never know... CRLs can be as large as 10MB, but a good CA like DigiCert will try to keep its CRLs small through partitioning. Can you post the certificate on https://mysite:443/ ? – velcrow Sep 16 '13 at 23:20

6 Answers

2

I'm wondering:

$i = 0;
while ($row = $result->fetch_object()) {
    $data = curl_multi_getcontent($ch[$i]);
    $json_data = json_decode($data);
    var_dump($json_data);

Are you forgetting to increment $i? If so, you already grabbed the content for $ch[0], and then you call curl_multi_getcontent again.

Also, I've written a blog post covering concurrent requests with PHP's cURL extension, and it contains a general function for curl multi requests. You could call this function in the following manner:

$responses = multi(
    $requests = [
        ['url' => 'https://example.com/search/username1/'],
        ['url' => 'https://example.com/search/username2/'],
        ['url' => 'https://example.com/search/username3/']
    ],
    $opts = [
        CURLOPT_CAINFO => 'cacert.pem',
        CURLOPT_USERPWD => "username:password"
    ]
);

Then, you cycle through the responses array:

foreach ($responses as $response) {
    if ($response['error']) {
        // handle error
        continue;
    }
    // check for empty response
    if ($response['data'] === null) {
        // examine $response['info']
        continue;
    }
    // handle data
    $data = json_decode($response['data']);
    // do something
}

Using this function, you could make a simple test of accessing https sites using the following call:

multi(
    $requests = [
        'google' => ['url' => 'https://www.google.com'],
        'linkedin' => ['url'=> 'https://www.linkedin.com/']
    ],
    $opts = [
        CURLOPT_CAINFO => '/path/to/your/cacert.pem',
        CURLOPT_SSL_VERIFYPEER => true
    ]
);
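
The multi() function itself lives in the blog post and is not reproduced here. As a rough idea of what such a helper might look like, here is a minimal sketch. It is an assumption, not the blog post's code: the signature and the 'error', 'data' and 'info' keys are inferred only from the calls and the foreach loop above.

```php
<?php
// Hypothetical sketch of a multi() helper; the real one is in the blog post.
// Assumed signature: multi(array $requests, array $opts = []): array
function multi(array $requests, array $opts = []): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($requests as $key => $request) {
        $ch = curl_init();
        // Shared options first, then the per-request URL.
        curl_setopt_array($ch, $opts);
        curl_setopt($ch, CURLOPT_URL, $request['url']);
        // Needed so curl_multi_getcontent() has a body to return.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($status === CURLM_OK && $running > 0) {
            // Wait for socket activity instead of spinning.
            curl_multi_select($mh, 1.0);
        }
    } while ($running > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    $responses = [];
    foreach ($handles as $key => $ch) {
        $error = curl_error($ch);
        $responses[$key] = [
            'error' => $error !== '' ? $error : null,
            'data'  => curl_multi_getcontent($ch),
            'info'  => curl_getinfo($ch),
        ];
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $responses;
}
```

Responses are keyed the same way as the requests, so the 'google' and 'linkedin' keys in the test call would carry through to the result array.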
Yahya Hussein
AdamJonR
  • I do increment the $i. The problem is an `Unknown SSL protocol error`. Do you have any idea why `curl_setopt($ch[$i], CURLOPT_CAINFO, 'cacert.pem');` doesn't work with `curl_multi_exec` but does work with the regular `curl_exec`? – Blake Plumb Sep 23 '13 at 14:23
  • I've added another example call to the function. This works on my system. Can you try using the function from the blog post and making this call on your system? It's a very basic test just grabbing two https sites (google and linkedin) concurrently. If this works, we can work from there. – AdamJonR Sep 23 '13 at 18:27
  • This is the best answer that has been posted thus far. – Blake Plumb Sep 24 '13 at 03:04
  • @BlakePlumb Were you able to try this and see what happened on your setup? The function also returns the info for the request, which could be useful, too. – AdamJonR Sep 24 '13 at 04:42
1

I see that your execution loop differs from the one advised in the PHP documentation:

do {
  $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

Note that the while condition tests the function's return value, not the second parameter.

Edit: Thanks to Adam's comment, I have tested both forms and found that they behave the same and both run asynchronously. Here is a working example of an asynchronous multi-request that reads the content into a variable:

<?php
$ch = array();
$mh = curl_multi_init();
$total = 100;

echo 'Start: ' . microtime(true) . "\n";

for ($i = 0; $i < $total; $i++) {
    $ch[$i] = curl_init();
    curl_setopt($ch[$i], CURLOPT_URL, 'http://localhost/sleep.php?t=' . $i);
    curl_setopt($ch[$i], CURLOPT_HEADER, 0);
    curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, true);

    curl_multi_add_handle($mh, $ch[$i]);
}

$active = null;
do {
    $mrc = curl_multi_exec($mh, $active);
    usleep(100); // Maybe needed to limit CPU load (See P.S.)
} while ($active);

foreach ($ch AS $i => $c) {
    $r = curl_multi_getcontent($c);
    var_dump($r);
    curl_multi_remove_handle($mh, $c);
}

curl_multi_close($mh);

echo 'End: ' . microtime(true) . "\n";

And the test file sleep.php:

<?php
$start = microtime(true);

sleep( rand(3, 5) );

$end = microtime(true);

echo $_GET['t'], ': ', $start, ' - ', $end, ' - ', ($end - $start);
echo "\n";

P.S. The initial idea behind calling usleep inside the loop was to pause it briefly and so reduce the number of iterations while cURL waits for responses. At first it seemed to work that way, but later tests with top showed only a minimal difference in CPU load (17% with usleep versus 20% without it). So I do not know whether it is worth using. Tests on a real server might show different results.
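
One alternative to a fixed usleep, sketched here as a suggestion rather than something measured in the tests above, is to let curl_multi_select() block until there is socket activity. The function name run_until_done and the one-second timeout are illustrative choices.

```php
<?php
// Sketch: drive a multi handle to completion without a fixed usleep().
// run_until_done is a hypothetical helper name.
function run_until_done($mh): void
{
    do {
        $status = curl_multi_exec($mh, $active);
        if ($status === CURLM_OK && $active > 0) {
            // Block for up to one second until a socket is ready.
            // curl_multi_select() can return -1 when there is nothing
            // to wait on; a short sleep then keeps the loop from spinning.
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(1000);
            }
        }
    } while ($active > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));
}
```

It would replace the do/while loop in the example above: call run_until_done($mh) after the handles are added and before the curl_multi_getcontent() loop.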

Edit 2: I have tested my code by making a request to a password-protected HTTPS page (with CURLOPT_CAINFO and CURLOPT_USERPWD set as in the question). It works as expected. There is probably a bug in your version of PHP or cURL. My versions are "PHP Version 5.3.10-1ubuntu3.8" and cURL 7.22.0, and they have no problems.

Alex
  • You can process curl_multi* functions using a variety of means. The simplest way is to wait for all of the requests to finish and then handle the responses all at once, which is what his example is doing. – AdamJonR Sep 23 '13 at 17:55
0

Use $running = null; in place of $running = 0;.

As per the links:

  1. multiple-curl-requests-with-php

  2. http://www.php.net/manual/en/function.curl-multi-exec.php

In both cases the variable is initialized to NULL. This is because of the function signature:

curl_multi_exec ( resource $mh , int &$still_running )

The second argument is a reference to a variable.

Also, you might find this useful: php single curl works but multi curl doesn't work

Community
aaron
0

curl_multi_getcontent() will return null, with no error, if any of:

  • CURLOPT_RETURNTRANSFER is not set in the request
  • You are using a custom write function using CURLOPT_WRITEFUNCTION

There are other situations where it may return null with an error, which you can check using curl_errno() and curl_error().

thomasrutter
-1

curl_multi_exec executes HTTP requests concurrently, and the requests may not complete in the order in which you added them to the multi handle $mh. To get the responses of completed requests, you should use the curl_multi_info_read function. You can read more about it on php.net: http://php.net/manual/ru/function.curl-multi-info-read.php
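
A minimal sketch of how curl_multi_info_read() might be used once the exec loop has finished. The helper name collect_results and the shape of the returned array are illustrative assumptions. Each message's 'result' field carries the cURL error code for that transfer.

```php
<?php
// Sketch: collect per-handle results with curl_multi_info_read().
// collect_results is a hypothetical helper; $handles is the array of
// easy handles that were added to $mh (for example $ch in the question).
function collect_results($mh, array $handles): array
{
    $results = [];
    // Each call returns one queued message, or false when none remain.
    while (($info = curl_multi_info_read($mh)) !== false) {
        if ($info['msg'] !== CURLMSG_DONE) {
            continue;
        }
        $ch = $info['handle'];
        // Map the finished handle back to its index in $handles.
        $key = array_search($ch, $handles, true);
        if ($key === false) {
            continue;
        }
        $results[$key] = [
            'code'  => $info['result'],
            'error' => $info['result'] === CURLE_OK
                ? null
                : curl_strerror($info['result']),
            'data'  => curl_multi_getcontent($ch),
        ];
    }
    return $results;
}
```

Because the results are keyed by the handle's position in $handles, they can still be matched to the original rows even when transfers finish out of order.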

max
-1

Did you set CURLOPT_SSL_VERIFYPEER to true?

EliteTech