I'm trying to download a biggish CSV file from a website. I've seen the suggestion to use file_get_contents and file_put_contents, but due to the file size I opted to use cURL and its file handle option so the data goes directly to a file rather than into PHP memory.

However, I seem to be getting a weird error whereby the entire file doesn't always download. I'm unsure how to verify that the entire file has been downloaded without using the response headers, and I'm not sure how to access those when using the file handle option of cURL.

This is what I am using at the minute to get the file.

    $ch = curl_init('*FILE_TO_GET*');
    $fp = fopen('*WHERE_TO_FILE*', 'wb');
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);

The file is approximately 3.8 MB at the moment; however, it has the potential to get much larger.

James
  • http://www.google.com/search?q=php+curl+response+header – Karoly Horvath Feb 28 '13 at 11:17
  • Try `curl_getinfo($ch)` and check if all the parameters are fine. – SilentAssassin Feb 28 '13 at 11:18
  • @KarolyHorvath Thanks but I already know how to use google. All examples I can find require the file to be loaded into memory to get the headers. – James Feb 28 '13 at 11:22
  • @SilentAssassin I thought getinfo only related to the request, not the response? Which parameters should I check? – James Feb 28 '13 at 11:25
  • @James: Get length + then save to file – Karoly Horvath Feb 28 '13 at 11:28
  • @KarolyHorvath I don't want to buffer the entire file in memory just so I can get the headers. – James Feb 28 '13 at 11:32
  • could you possibly use a hash of the file and compare the two? – SchautDollar Feb 28 '13 at 11:32
  • @SchautDollar Not unless a hash is provided for the file I'm trying to download in the response headers which I'm currently unsure how to get without putting the file into memory. – James Feb 28 '13 at 11:35
  • @James Your scenario is different and I've not downloaded a file, so I am unaware which parameter to check. I used to check the `http_code` from `curl_getinfo` and the error info from `curl_error`, as I was extracting content from a web page to check whether the content of the page was valid. – SilentAssassin Feb 28 '13 at 11:45

1 Answer

Without knowing the size of the file to be downloaded it is not possible to determine whether it is completely downloaded.

As such, you need to get the header information for the file. The response headers are usually quite small, and given that you are downloading a large file, fetching them is worth the effort. They will usually include the file size (the `Content-Length` header) along with some other bits of information.
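The headers can also be read during the same transfer without buffering the body, since cURL can hand header lines to a callback while the body still goes to the file handle. The following is a minimal sketch of that idea, not a definitive implementation: `parseContentLength` and `downloadWithLength` are hypothetical helper names, and it assumes the server actually sends a `Content-Length` header (it will not when the response uses chunked transfer encoding).

```php
<?php
// Hypothetical helper: pull the Content-Length value out of one raw header
// line, or return null if the line is some other header.
function parseContentLength(string $headerLine): ?int
{
    if (preg_match('/^Content-Length:\s*(\d+)\s*$/i', $headerLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: stream $url into $path and return the size the server
// advertised, or null if it sent no Content-Length header.
function downloadWithLength(string $url, string $path): ?int
{
    $expected = null;
    $fp = fopen($path, 'wb');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp); // body goes straight to disk
    // cURL calls this once per header line; the body is never passed here.
    curl_setopt($ch, CURLOPT_HEADERFUNCTION,
        function ($ch, string $line) use (&$expected): int {
            $expected = parseContentLength($line) ?? $expected;
            return strlen($line); // must return the number of bytes handled
        });
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    return $expected;
}
```

If `downloadWithLength` returns null, the server did not advertise a size and this check cannot be used for that response.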

As you have not tried getting headers before, try the neat little function in "PHP: Remote file size without downloading file", which determines the file size from the headers alone.

The example there details how to download the headers and extract the size info. Then you can simply compare this value with the size of the downloaded file, and voilà.
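As a rough illustration of that comparison (a sketch under assumptions, not the code from the linked example), the size can be requested with a header-only cURL call and then checked against the file on disk. The names `remoteFileSize` and `isCompleteDownload` are hypothetical, and the approach assumes the server reports a `Content-Length` for header-only requests.

```php
<?php
// Hypothetical helper: ask the server for headers only and return the
// advertised Content-Length, or null if the server does not report one.
function remoteFileSize(string $url): ?int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep output off stdout
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $ok = curl_exec($ch);
    $length = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
    curl_close($ch);
    // cURL reports -1 when no Content-Length header was sent.
    return ($ok !== false && $length >= 0) ? (int) $length : null;
}

// Hypothetical helper: true when the file on disk matches the expected size.
function isCompleteDownload(string $path, int $expectedBytes): bool
{
    clearstatcache(true, $path); // filesize() results are cached per request
    return is_file($path) && filesize($path) === $expectedBytes;
}
```

A caller would fetch the size first, run the existing `CURLOPT_FILE` download, and then retry or report an error if `isCompleteDownload` returns false.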

Kami