I can't get my download script to work with external files: the file downloads, but it is corrupted and won't open. I think it's because filesize() can't determine the size of an external file.

This is my script:

function getMimeType($filename){
    $ext = pathinfo($filename, PATHINFO_EXTENSION);
    $ext = strtolower($ext);

    $mime_types=array(
        "pdf" => "application/pdf",
        "txt" => "text/plain",
        "html" => "text/html",
        "htm" => "text/html",
        "exe" => "application/octet-stream",
        "zip" => "application/zip",
        "doc" => "application/msword",
        "xls" => "application/vnd.ms-excel",
        "ppt" => "application/vnd.ms-powerpoint",
        "gif" => "image/gif",
        "png" => "image/png",
        "jpeg" => "image/jpeg",
        "jpg" => "image/jpeg",
        "php" => "text/plain",
        "csv" => "text/csv",
        "xlsx" => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        "pptx" => "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        "docx" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
    );

    if(isset($mime_types[$ext])){
        return $mime_types[$ext];
    } else {
        return 'application/octet-stream';
    }
}

$path = "http://www.example.com/file.zip";

/* Does not work on external files
// check that the file exists and is readable
if (!is_readable($path))
    die('File is not readable or does not exist!');
*/

$file_headers = @get_headers($path);
if($file_headers[0] == 'HTTP/1.1 404 Not Found') {
    echo "File does not exist.";
} else {

$filename = pathinfo($path, PATHINFO_BASENAME);

// get mime type of file by extension
$mime_type = getMimeType($filename);

// set headers
header('Pragma: public');
header('Expires: -1');
header('Cache-Control: public, must-revalidate, post-check=0, pre-check=0');
header('Content-Transfer-Encoding: binary');
header("Content-Disposition: attachment; filename=\"$filename\"");
header("Content-Length: " . filesize($path));
header("Content-Type: $mime_type");
header("Content-Description: File Transfer");

// read file as chunk
if ( $fp = fopen($path, 'rb') ) {
    ob_end_clean();

    while( !feof($fp) and (connection_status()==0) ) {
        print(fread($fp, 8192));
        flush();
    }

    @fclose($fp);
    exit;
}

}

I believe it can be done with cURL, but my knowledge is lacking. What I would like to know:

  • How do I check whether the file exists, and how do I get the file size with cURL?

  • Would it be better to just use cURL and forget about fopen?

  • Are the headers set correctly?

Any advice is much appreciated!

2by

6 Answers

1

The problem is your Content-Length header, which ends up set to 0. Since the get_headers() call already gives you the remote Content-Length, simply change the following line:

header("Content-Length: " . filesize($path));

to:

header($file_headers[8]);

Note that the position of the header inside $file_headers depends on the server (index 8 worked for me). Check the manual for details, or run print_r($file_headers) to see what you actually get.

If you don't care about the Content-Length header, simply comment that line out. Most browsers handle a download without it; they just can't show progress.
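Because the index is server-dependent, it may be safer to search the list for the header by name. The following is only a sketch: find_content_length() is a hypothetical helper, not part of the original script, and the sample array merely imitates what get_headers($path) might return. If the request was redirected, the list can contain more than one response, so the last match is kept.

```php
<?php
/**
 * Hypothetical helper: returns the value of the last Content-Length line
 * in a list of raw header lines, or null if there is none.
 */
function find_content_length(array $headerLines): ?int
{
    $length = null;
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so match accordingly.
        if (preg_match('/^Content-Length:\s*(\d+)\s*$/i', $line, $m)) {
            // Keep overwriting: after redirects the last match belongs
            // to the final response.
            $length = (int) $m[1];
        }
    }
    return $length;
}

// Sample data shaped like the result of get_headers($path).
$file_headers = [
    'HTTP/1.1 200 OK',
    'Content-Type: application/zip',
    'Content-Length: 1048576',
];

$length = find_content_length($file_headers);
if ($length !== null) {
    // In the download script this would be: header("Content-Length: $length");
    echo "Content-Length: $length\n";
}
```

When the helper returns null, the Content-Length header can simply be left out, as described above.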

Julien
  • You do realize that setting the executing script's headers has little impact on the request he is making to an external URL via fopen? – khael Nov 21 '14 at 05:47
  • His code works, there is nothing wrong with using fopen, his only problem comes from the fact that he is passing 0 as the content-length header. – Julien Nov 21 '14 at 15:35
  • But my question was about the headers set for what `the current` PHP script will return (meaning the use of `header()`) and its influence on the `query` he was doing with fopen on an external URL. I hope that clarifies it. – khael Nov 21 '14 at 15:39
1

This code works fine for downloading a file from a URL:

set_time_limit(0);

//File to save the contents to
$fp = fopen ('r.jpg', 'w+');

$url = "http://cgr.ir/test.jpg";

//Here is the file we are downloading, replace spaces with %20
$ch = curl_init(str_replace(" ","%20",$url));

curl_setopt($ch, CURLOPT_TIMEOUT, 50);

//give curl the file pointer so that it can write to it
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

// With CURLOPT_FILE set, curl_exec() writes the body to $fp and returns true or false
$data = curl_exec($ch);

//done
curl_close($ch);
fclose($fp);
reza ostadi
0

Function:

<?php
/**
 * Returns the size of a file without downloading it, or -1 if the file
 * size could not be determined.
 *
 * @param $url - The location of the remote file to download. Cannot
 * be null or empty.
 *
 * @return The size of the file referenced by $url, or -1 if the size
 * could not be determined.
 */
function curl_get_file_size( $url ) {
  // Assume failure.
  $result = -1;

  $curl = curl_init( $url );

  // Issue a HEAD request and follow any redirects.
  curl_setopt( $curl, CURLOPT_NOBODY, true );
  curl_setopt( $curl, CURLOPT_HEADER, true );
  curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
  curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
  // Any user-agent string works here; this one is only a placeholder.
  curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; PHP cURL)' );

  $data = curl_exec( $curl );
  curl_close( $curl );

  if( $data ) {
    // Default to -1 so an absent header matches the documented return value.
    $content_length = -1;
    $status = 0;

    // Accept any HTTP version in the status line, e.g. "HTTP/1.1 200" or "HTTP/2 200".
    if( preg_match( "/^HTTP\/[\d.]+ (\d\d\d)/", $data, $matches ) ) {
      $status = (int)$matches[1];
    }

    // Header names are case-insensitive.
    if( preg_match( "/Content-Length: (\d+)/i", $data, $matches ) ) {
      $content_length = (int)$matches[1];
    }

    // http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
    if( $status == 200 || ($status > 300 && $status <= 308) ) {
      $result = $content_length;
    }
  }

  return $result;
}
?>

Function call:

$file_size = curl_get_file_size( "http://stackoverflow.com/questions/2602612/php-remote-file-size-without-downloading-file" );
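One caveat: with CURLOPT_FOLLOWLOCATION and CURLOPT_HEADER both enabled, the returned text contains one header block per hop, so a first-match regex may read the redirect response instead of the file itself. A possible workaround, sketched below, is to parse only the last block. parse_final_response() is a hypothetical helper and the sample header text is invented for illustration.

```php
<?php
/**
 * Hypothetical helper: extracts the status code and Content-Length from the
 * LAST response block in raw header text, such as curl_exec() returns when
 * CURLOPT_HEADER and CURLOPT_NOBODY are set. Size is -1 when unknown.
 */
function parse_final_response(string $rawHeaders): array
{
    // Header blocks are separated by an empty line.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    $last = end($blocks);

    $status = 0;
    $size = -1;
    if (preg_match('/^HTTP\/[\d.]+ (\d{3})/', $last, $m)) {
        $status = (int) $m[1];
    }
    // "m" lets ^ match at each line start, "i" ignores header-name case.
    if (preg_match('/^Content-Length:\s*(\d+)/mi', $last, $m)) {
        $size = (int) $m[1];
    }
    return ['status' => $status, 'size' => $size];
}

// Invented sample: one redirect followed by the real response.
$raw = "HTTP/1.1 301 Moved Permanently\r\nLocation: /file.zip\r\nContent-Length: 178\r\n\r\n"
     . "HTTP/1.1 200 OK\r\nContent-Type: application/zip\r\nContent-Length: 1048576\r\n\r\n";

$info = parse_final_response($raw);
echo $info['status'], ' ', $info['size'], "\n"; // prints "200 1048576"
```

Alternatively, calling curl_getinfo($curl, CURLINFO_HTTP_CODE) before curl_close() should give the status of the final response without any parsing.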
0

Try using something like this:

function get_data($url) 
{
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}

Unfortunately, without more detail about your specific request or files, I can't offer code that matches your situation more exactly. The curl_get_file_size function from the other answer will give you the size, should you ever need it.

khael
0

You can try this approach as well. I am assuming that your source URL is in $sourceUrl and that the path to save the file to is in $destinationPath:

$destFilename = 'my_file_name.ext';
$destinationPath = 'your/destination/path/'.$destFilename;

if(ini_get('allow_url_fopen')) {
    // Fetch first so that a failed download is not written out as an empty file.
    $data = @file_get_contents($sourceUrl);
    if($data === false || @file_put_contents($destinationPath, $data) === false){
        // $http_response_header is only set if the server actually responded.
        $http_status = isset($http_response_header[0]) ? $http_response_header[0] : 'An error';
        printf("%s encountered while attempting to download %s\n", $http_status, $sourceUrl);
    }
} elseif(function_exists('curl_init')) {
    $ch = curl_init($sourceUrl);
    $fp = fopen($destinationPath, "wb");

    $options = array(
        CURLOPT_FILE => $fp,
        CURLOPT_HEADER => 0,
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_TIMEOUT => 120); // in seconds

    curl_setopt_array($ch, $options);
    curl_exec($ch);
    $http_status = intval(curl_getinfo($ch, CURLINFO_HTTP_CODE));
    curl_close($ch);
    fclose($fp);

    // delete the file if the download was unsuccessful
    if($http_status != 200) {
        unlink($destinationPath);
        printf("HTTP status %s encountered while attempting to download %s\n", $http_status, $sourceUrl);
    }
} else {
    printf("Looks like %s is off and %s is not enabled. No images were imported.\n", '<code>allow_url_fopen</code>', '<code>cURL</code>');
}

In the cURL branch you can also call curl_getinfo($ch, CURLINFO_CONTENT_TYPE) before curl_close() to read the content type of the file and use it as you need.

Ram Sharma
0

IMHO it is a good idea not to rely on the PHP cURL module being available. Your snippet works with a small modification:

First change

$file_headers = @get_headers($path);

to

$file_headers = @get_headers($path,1);

to get named array keys (see the PHP reference for get_headers).

With this modification the HTTP status line is still in $file_headers[0], but you also get more useful data that can be passed through (validation recommended): Content-Length and even Content-Type, which lets you drop the extension-based MIME type detection.

Change

header("Content-Length: " . filesize($path));

to

header("Content-Length: " . $file_headers['Content-Length']);

and

header("Content-Type: $mime_type");

to

header("Content-Type: " . $file_headers['Content-Type']);

Even if your "path" is a trusted source, you might want to add some validation, since you should not assume external data is of the kind you expect.
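As a rough idea of what such validation might look like, here is a sketch. The helper names and the whitelist of types are hypothetical, and the sample array only imitates the shape of get_headers($path, 1) after one redirect, where repeated headers arrive as arrays. Header keys keep whatever capitalisation the server sent, so a real script may need a case-insensitive lookup as well.

```php
<?php
/**
 * Hypothetical helpers for validating values taken from get_headers($url, 1).
 * When a header occurs more than once (for example after a redirect), the
 * value is an array; the last element is used here.
 */
function last_header_value($value): ?string
{
    if (is_array($value)) {
        $value = end($value);
    }
    return is_string($value) ? trim($value) : null;
}

function safe_content_length($value): ?int
{
    $value = last_header_value($value);
    // Accept plain decimal digits only.
    return ($value !== null && ctype_digit($value)) ? (int) $value : null;
}

function safe_content_type($value, array $allowed, string $fallback = 'application/octet-stream'): string
{
    $value = last_header_value($value);
    if ($value === null) {
        return $fallback;
    }
    // Strip parameters such as "; charset=UTF-8" before comparing.
    $type = strtolower(trim(explode(';', $value)[0]));
    return in_array($type, $allowed, true) ? $type : $fallback;
}

// Sample data shaped like get_headers($path, 1) after one redirect.
$file_headers = [
    0 => 'HTTP/1.1 301 Moved Permanently',
    1 => 'HTTP/1.1 200 OK',
    'Content-Length' => ['178', '1048576'],
    'Content-Type' => ['text/html', 'application/zip'],
];

$length = safe_content_length($file_headers['Content-Length'] ?? null);
$type = safe_content_type($file_headers['Content-Type'] ?? null, ['application/zip', 'application/pdf']);

echo $type, ' ', var_export($length, true), "\n";
```

In the download script, header("Content-Type: $type") would then always be sent, while the Content-Length header would only be sent when $length is not null.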

labemi