I have a website that tracks individual players' data for an online game. Every day at the same time, a cron job runs that uses cURL to fetch each player's data from the game company's server (each player has their own page that must be fetched). Previously I looped through the players, making one cURL request at a time and storing the data. While this was a slow process, everything worked fine for weeks (handling anywhere from 500 to 1,000 players every day).
As we gained more players, the cron job started to take too long to run, so about a week ago I rewrote it using ParallelCurl (cURL multi handling). It was set to open no more than 10 connections at a time and was running perfectly, doing about 3,000 pages in 3-4 minutes. I never noticed anything wrong until, a day or two later, I was randomly unable to connect to their servers (cURL returned an HTTP code of 0). I thought I was permanently banned/blocked, until about 1-2 hours later I could suddenly connect again. The block occurred several hours after the cron job had run for the day; the only requests being made at the time were the occasional single-file requests (which have been working fine and left untouched for months).
The past few days have all been like this. The cron job runs fine, then sometime later (a few hours) I can't get a connection for an hour or two. Today I updated the cron job to open only 5 connections at a time. Everything worked fine until, 5-6 hours later, I couldn't connect for 2 hours.
I've done a ton of googling and can't seem to find anything useful. My guess is that a firewall is blocking my connection, but I'm really in over my head when it comes to anything like that. I am clueless as to what is happening and what I need to do to fix it. I'd be grateful for any help, even a guess or just a point in the right direction.
Note that I'm using a shared web host (HostGator). Two days ago I submitted a ticket and made a post on their forums. I also sent an e-mail to the company, and I have yet to see a single reply to any of them.
--EDIT--
Here's my code to run the multiple requests using parallelcurl. The include has been left untouched and is the same as shown here
set_time_limit(0);
require('path/to/parallelcurl.php');

$plyrs = array(); // normally an array of all the players I need to update

// Callback invoked by ParallelCurl when a request completes
function on_request_done($content, $url, $ch, $player) {
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($httpcode !== 200) {
        echo 'Could Not Find '.$player.'<br />';
        return;
    } else { // player was found, store in db
        echo 'Updated '.$player.'<br />';
    }
}

$max_requests = 5;
$curl_options = array(
    CURLOPT_SSL_VERIFYPEER => FALSE,
    CURLOPT_SSL_VERIFYHOST => FALSE,
    CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9',
);
$parallel_curl = new ParallelCurl($max_requests, $curl_options);

foreach ($plyrs as $p) {
    $search_url = "http://website.com/".urlencode($p);
    $parallel_curl->startRequest($search_url, 'on_request_done', $p);
    usleep(300); // now that I think about it, does this actually do anything worthwhile positioned here?
}
$parallel_curl->finishAllRequests();
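For reference on the usleep() question in the loop above: usleep() takes microseconds, so usleep(300) pauses for only 0.3 milliseconds per request. Below is a minimal sketch of how a per-request delay could be derived from a target request rate. The helper name delay_for_rate and the rate of 5 requests per second are assumptions for illustration, not something the game company's server is known to require.

```php
<?php
// Hypothetical helper (not part of ParallelCurl): returns the number of
// microseconds to sleep between request starts so that no more than
// $requestsPerSecond requests are started per second.
function delay_for_rate(float $requestsPerSecond): int
{
    if ($requestsPerSecond <= 0) {
        throw new InvalidArgumentException('Rate must be positive');
    }
    // 1 second = 1,000,000 microseconds
    return (int) round(1000000 / $requestsPerSecond);
}

// Assumed target of 5 requests per second.
$delay = delay_for_rate(5);
echo $delay, "\n";

// Inside the foreach loop, usleep($delay) would replace usleep(300).
```

With a delay like this the total run time is bounded by the rate rather than by the number of parallel connections, so 3,000 pages at 5 per second would take at least 10 minutes.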
Here's the code I use to simply see if I can connect or not
$ch = curl_init();
$options = array(
    CURLOPT_URL            => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HEADER         => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_ENCODING       => "",
    CURLOPT_AUTOREFERER    => true,
    CURLOPT_CONNECTTIMEOUT => 120,
    CURLOPT_TIMEOUT        => 120,
    CURLOPT_MAXREDIRS      => 10,
    CURLOPT_SSL_VERIFYPEER => false,
    CURLOPT_SSL_VERIFYHOST => false,
);
curl_setopt_array($ch, $options);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
print_r(curl_getinfo($ch));
if ($httpCode != 200) {
    echo "Return code is {$httpCode} \n"
        .curl_error($ch);
} else {
    echo "<pre>".htmlspecialchars($response)."</pre>";
}
curl_close($ch);
Running that when I'm unable to connect results in this:
Array
(
    [url] => http://urlicantgetto.com/
    [content_type] =>
    [http_code] => 0
    [header_size] => 0
    [request_size] => 121
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 30.073574
    [namelookup_time] => 0.003384
    [connect_time] => 0.025365
    [pretransfer_time] => 0.025466
    [size_upload] => 0
    [size_download] => 0
    [speed_download] => 0
    [speed_upload] => 0
    [download_content_length] => -1
    [upload_content_length] => 0
    [starttransfer_time] => 30.073523
    [redirect_time] => 0
)
Return code is 0
Empty reply from server
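An http_code of 0 only says that no HTTP response was received, so the cURL error number may help narrow things down. In the output above, the connection itself was established quickly (connect_time of about 0.025 s) and the error text "Empty reply from server" corresponds to libcurl error 52. Below is a minimal sketch of a classifier, assuming the standard libcurl error numbers; classify_curl_failure is a hypothetical helper, not part of PHP or ParallelCurl.

```php
<?php
// Hypothetical helper: turn a cURL error number and an HTTP status code
// into a short diagnosis. The numeric values are standard libcurl codes.
function classify_curl_failure(int $errno, int $httpCode): string
{
    switch ($errno) {
        case 0:  // CURLE_OK: the transfer itself succeeded
            return $httpCode === 200 ? 'ok' : "http error $httpCode";
        case 6:  // CURLE_COULDNT_RESOLVE_HOST
            return 'dns lookup failed';
        case 7:  // CURLE_COULDNT_CONNECT
            return 'connection refused or filtered';
        case 28: // CURLE_OPERATION_TIMEDOUT
            return 'timed out';
        case 52: // CURLE_GOT_NOTHING ("Empty reply from server")
            return 'connected, but server sent nothing back';
        default:
            return "other curl error $errno";
    }
}

// Error 52 with http_code 0 matches the failing run shown above.
echo classify_curl_failure(52, 0), "\n";
```

In the connection test, the first argument would come from curl_errno($ch) and the second from curl_getinfo($ch, CURLINFO_HTTP_CODE), both read before curl_close($ch).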