
I need to use cURL with PHP to access a big file in cloud storage. The storage URL changes according to the request. The PHP script uses a library to grant access with cURL and return the file.

But the HTTP request is returning a 500 status because PHP is failing with this error:

PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 133700011 bytes)

So, how can I properly redirect the requests? Sending a Location: header doesn't work either, because it drops the extra parameter inside the HTTP header.

Here is the script:

function fetch($url, $cookie = null) {
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // buffer the whole response into $result
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLINFO_HEADER_OUT, true);

    // pass the access cookie along when one is provided
    if ($cookie) {
        curl_setopt($ch, CURLOPT_COOKIE, $cookie);
    }

    $result = curl_exec($ch);

    curl_close($ch);

    return $result;
}

Any hints?

jcfaracco
  • All the major cloud storage providers I'm aware of permit the creation of a short-term link to a private file, which is likely to be a better approach than proxying the contents of the file through your server. See https://docs.aws.amazon.com/aws-sdk-php/v3/guide/service/s3-presigned-url.html for how S3 handles this. – ceejayoz Dec 06 '17 at 18:49
  • @ceejayoz cloudatcost.com doesn't – hanshenrik Dec 06 '17 at 19:00
  • Check this answer: https://stackoverflow.com/questions/38417350/php-curl-realtime-proxy-stream-file/38418060#38418060 – drew010 Dec 07 '17 at 01:22
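
Building on ceejayoz's comment above: a minimal sketch of generating a short-lived pre-signed URL with the AWS SDK for PHP (v3), per the linked guide. The region, bucket, key, and expiry here are hypothetical placeholders:

require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3 = new S3Client([
    'version' => 'latest',
    'region'  => 'us-east-1', // hypothetical region
]);

// hypothetical bucket and key
$cmd = $s3->getCommand('GetObject', [
    'Bucket' => 'my-bucket',
    'Key'    => 'big-file.bin',
]);

// signed URL valid for 20 minutes; redirect the client straight to it,
// so the file never passes through this server at all
$request = $s3->createPresignedRequest($cmd, '+20 minutes');
header('Location: ' . (string) $request->getUri());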

2 Answers


curl_setopt($ch, CURLOPT_FILE, $stream_resource);

Where $stream_resource is something opened with fopen(), say.
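
A minimal sketch of that approach (the URL and output path are hypothetical; the stream must stay open until curl_exec() finishes):

$fp = fopen('/tmp/bigfile.bin', 'wb'); // or fopen('php://output', 'wb') to stream to the client
$ch = curl_init('https://storage.example.com/bigfile.bin');

curl_setopt($ch, CURLOPT_FILE, $fp);            // write the response body to the stream, not to memory
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

curl_exec($ch);

curl_close($ch);
fclose($fp);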

Cupcake Protocol
  • nah, the default stream is STDOUT, which goes directly to the client, no need to set any custom value here :) – hanshenrik Dec 06 '17 at 18:53
  • Since the server is running out of memory, it sure seemed like it was reading into a variable. Guess my php curl knowledge is not so great. – Cupcake Protocol Dec 06 '17 at 18:55
  • 2
    you are correct, its reading everything into $result because he used CURLOPT_RETURNTRANSFER - but he shouldn't do that (for reasons i explained in my answer) – hanshenrik Dec 06 '17 at 18:58
  • @hanshenrik your suggestion solved my problem. :-) I removed the option and everything is working fine. – jcfaracco Dec 07 '17 at 19:08

Your code buffers the entire file in memory before sending it to the client. That is both slow (nothing is sent to the client until your download is 100% complete) and memory-hungry (you're holding the whole file in memory at once). Instead, send the file to the client in batches as it's being downloaded; that is much faster and uses little memory, because each batch can be freed once it has been sent to the client.

Luckily, that is very easy with curl, and it's even the default mode, but you explicitly tell curl not to do it when you set CURLOPT_RETURNTRANSFER, so stop doing that. Then curl_exec($ch) returns a bool instead of a string, and the transfer is sent to the client in batches, which should solve your memory problem and be much faster.

Also note that curl_setopt() and curl_exec() return bool(false) if there was an error. Those errors should not be ignored, so I suggest you replace them with these error-catching wrappers:

// curl_setopt() wrapper that throws instead of returning false on failure
function ecurl_setopt(/*resource*/ $ch, int $option, /*mixed*/ $value): bool {
    $ret = curl_setopt($ch, $option, $value);
    if ($ret !== true) {
        // the failing option should be obvious from the stack trace
        throw new RuntimeException('curl_setopt() failed. curl_errno: ' . return_var_dump(curl_errno($ch)) . '. curl_error: ' . curl_error($ch));
    }
    return true;
}

// curl_exec() wrapper that throws instead of returning false on failure
function ecurl_exec(/*resource*/ $ch): bool {
    $ret = curl_exec($ch);
    if ($ret !== true) {
        throw new RuntimeException('curl_exec() failed. curl_errno: ' . return_var_dump(curl_errno($ch)) . '. curl_error: ' . curl_error($ch));
    }
    return true;
}

// capture var_dump() output as a string
function return_var_dump(/*...*/) {
    $args = func_get_args();
    ob_start();
    call_user_func_array('var_dump', $args);
    return ob_get_clean();
}
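
For illustration, a minimal sketch of the question's fetch() adapted to these wrappers (an adaptation, not code from either post): with CURLOPT_RETURNTRANSFER removed, curl writes each batch straight to the output as it arrives.

function fetch($url, $cookie = null): bool {
    $ch = curl_init();

    ecurl_setopt($ch, CURLOPT_URL, $url);
    // no CURLOPT_RETURNTRANSFER here: curl now prints each downloaded
    // chunk to stdout (the client) immediately, instead of buffering it
    ecurl_setopt($ch, CURLOPT_AUTOREFERER, true);
    ecurl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    if ($cookie) {
        ecurl_setopt($ch, CURLOPT_COOKIE, $cookie);
    }

    $ret = ecurl_exec($ch); // bool(true) on success, not the file contents

    curl_close($ch);

    return $ret;
}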
  • additionally, if the file you're downloading is compressible and the target server supports it (the vast majority of web servers do), setting curl_setopt($ch, CURLOPT_ENCODING, ''); enables every compression format curl was compiled with (usually gzip and deflate), which should make the transfer even faster :)

  • also note: if you're going to run a large number of big, slow, concurrent downloads this way, PHP is simply a bad choice of language performance-wise. Each download ties up an entire PHP process (or, in Apache mod_php's case, a thread) until it completes, and there's usually a very limited number of PHP processes allowed to run concurrently, which can block your entire website. (I'd write the download proxy in Go myself; Goroutines are near-perfect for this type of job.)

hanshenrik