I found a script that handles the responses from curl_multi requests asynchronously:
function rolling_curl($urls, $callback, $custom_options = null) {
    // make sure the rolling window isn't greater than the # of urls
    $rolling_window = 100;
    $rolling_window = (sizeof($urls) < $rolling_window) ? sizeof($urls) : $rolling_window;
    $master = curl_multi_init();
    // add additional curl options here
    $std_options = array(CURLOPT_RETURNTRANSFER => true,
                         CURLOPT_FOLLOWLOCATION => true,
                         CURLOPT_MAXREDIRS => 5);
    $options = ($custom_options) ? ($std_options + $custom_options) : $std_options;
    // $hosts[$i] holds the host of $urls[$i]; it is used to recover the index on completion
    $hosts = array();
    // start the first batch of requests
    for ($i = 0; $i < $rolling_window; $i++) {
        $hosts[$i] = parse_url($urls[$i], PHP_URL_HOST);
        $ch = curl_init();
        $options[CURLOPT_URL] = $urls[$i];
        curl_setopt_array($ch, $options);
        curl_multi_add_handle($master, $ch);
    }
    do {
        while (($execrun = curl_multi_exec($master, $running)) == CURLM_CALL_MULTI_PERFORM);
        if ($execrun != CURLM_OK)
            break;
        // a request was just completed -- find out which one
        while ($done = curl_multi_info_read($master)) {
            $info = curl_getinfo($done['handle']);
            // look up the index by host (only works while every url has a distinct host)
            $id = array_search(parse_url($info['url'], PHP_URL_HOST), $hosts);
            if ($info['http_code'] == 200) {
                $output = curl_multi_getcontent($done['handle']);
                // request successful. process output using the callback function.
                $callback($output, true, $id);
            } else {
                // request failed. still pass $id, because the callback takes three arguments.
                $callback("", false, $id);
            }
            // start a new request (it's important to do this before removing the old one)
            if ($i < sizeof($urls)) {
                $hosts[$i] = parse_url($urls[$i], PHP_URL_HOST);
                $ch = curl_init();
                $options[CURLOPT_URL] = $urls[$i];
                curl_setopt_array($ch, $options);
                curl_multi_add_handle($master, $ch);
                $i++;
                // force another pass so the newly added handle actually gets executed
                $running = true;
            }
            // remove the curl handle that just completed (on success and on failure)
            curl_multi_remove_handle($master, $done['handle']);
        }
    } while ($running);
    curl_multi_close($master);
    return true;
}
I have already edited it, because I want the callback function to receive the index that each url has in the input array. For example, if the input array is
$urls = array(0=> $url0, 1=> $url1, ...);
then I want $callback to be triggered with $id = 1 when the curl request for $url1 completes, so that I know which url the returned data belongs to. My first idea was to look up the curl url in the $urls array, but curl sometimes follows redirects, so the final url does not always match the one I passed in. Then I thought of comparing hosts instead (fortunately every url has a different host in my case), and it works: I build a host array when creating the curl handles and, when a request completes, I search that array for the host of the completed request to get the index. This is not the best way to get what I want, because it would not work for urls that share a host, and there may be a simpler way to do it. What is the best way to get the url index? PS: my callback function is
function callback($output,$success,$id){...}
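The shared-host limitation described above can be reproduced without any network access. The following is a minimal sketch, assuming three hypothetical urls (the example.com and example.org addresses are placeholders, not part of the original script). It builds the host array the same way rolling_curl does and then performs the lookup that would run when the request for index 1 completes:

```php
<?php
// Hypothetical urls for illustration only; the first two share a host.
$urls = array(
    0 => 'http://example.com/a',
    1 => 'http://example.com/b',
    2 => 'http://example.org/c',
);

// Build the host array the same way rolling_curl() does.
$hosts = array();
foreach ($urls as $i => $url) {
    $hosts[$i] = parse_url($url, PHP_URL_HOST);
}

// Simulate the lookup performed when the request for $urls[1] completes.
$id = array_search(parse_url($urls[1], PHP_URL_HOST), $hosts);

var_dump($id); // int(0): array_search returns the first matching key, not 1
```

Because array_search stops at the first match, every url on a repeated host is reported under the index of the first url with that host.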
".htmlentities($output); to compare the $callback method with the CURLOPT_WRITEFUNCTION method, and they are different: $callback returns the full page, while CURLOPT_WRITEFUNCTION returns only a part of it. – Franz Tesca Jul 20 '16 at 17:13