
I am trying to download a ZIP file from a given URL using cURL. A supplier gave me a URL from which I should download a ZIP file, but every time I try, I get a page saying that I am not logged in.

The URL I should get the file from looks like this:

https://www.tyre24.com/nl/nl/user/login/userid/USERID/password/PASSWORD/page/L2V4cG9ydC9kb3dubG9hZC90L01nPT0vYy9NVFE9Lw==

Here USERID and PASSWORD are placeholders that are filled in with the correct data. The strange thing is that the URL works when I enter it in my browser: the ZIP file is downloaded.

But every time I call that URL with cURL, I get an incorrect-login page. Could someone tell me what I am doing wrong?

There seems to be a redirect behind the given URL, which is why I added this to the cURL call: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

Here is my code:

set_time_limit(0);

//File to save the contents to
$fp = fopen ('result.zip', 'w+');

$url = "https://www.tyre24.com/nl/nl/user/login/userid/118151/password/5431tyre24/page/L2V4cG9ydC9kb3dubG9hZC90L01nPT0vYy9NVFE9Lw==";

//Here is the file we are downloading, replace spaces with %20
$ch = curl_init(str_replace(" ","%20",$url));

curl_setopt($ch, CURLOPT_TIMEOUT, 50);

//give curl the file pointer so that it can write to it
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

$data = curl_exec($ch);//get curl response

//done
curl_close($ch);

Am I doing something wrong?

user3824329

2 Answers


To download a ZIP file from an external source via cURL, use one of the following approaches:

First approach:

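function downloadZipFile($url, $filepath){
     $ch = curl_init($url);
     // Note: CURLOPT_HEADER prepends the response headers to the returned data,
     // so they end up at the start of the saved file.
     curl_setopt($ch, CURLOPT_HEADER, 1);
     // Return the response as a string instead of printing it.
     curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
     // Has no effect since PHP 5.1.3.
     curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
     // Redirects are not followed here.
     curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
     $raw_file_data = curl_exec($ch);

     if(curl_errno($ch)){
        echo 'error:' . curl_error($ch);
     }
     curl_close($ch);

     file_put_contents($filepath, $raw_file_data);
     return filesize($filepath) > 0;
}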

downloadZipFile("http://www.colorado.edu/conflict/peace/download/peace_essay.ZIP", "result.zip");

A few comments:

  • to get the data back from the remote source as a string, you have to set the CURLOPT_RETURNTRANSFER option
  • instead of consecutive fopen ... fwrite calls, you can use file_put_contents, which is handier

Here is a screenshot of result.zip, which was downloaded a few minutes earlier using the above approach:

[screenshot: result]

Second approach:

function downloadZipFile($url, $filepath){
     $fp = fopen($filepath, 'w+');
     $ch = curl_init($url);

     curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
     curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
     //curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
     curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
     curl_setopt($ch, CURLOPT_FILE, $fp);
     curl_exec($ch);

     curl_close($ch);
     fclose($fp);

     return filesize($filepath) > 0;
}
RomanPerekhrest
  • This is already an improvement: I receive a larger file now. The only problem now is that it is corrupted. When I try to open the downloaded ZIP, it says the file is damaged and cannot be opened. Is this something I can fix? – user3824329 Feb 08 '16 at 14:06
  • Now you have disabled CURLOPT_FOLLOWLOCATION, but that means it does not go further than the given URL. It has to go further, so I enabled it, and when I did I just received the same incorrect-login page again. – user3824329 Feb 08 '16 at 14:11
  • when you store the result of `downloadZipFile` function call into a variable what does it return to you? – RomanPerekhrest Feb 08 '16 at 14:16
  • It gives the following header:HTTP/1.1 302 Found Date: Mon, 08 Feb 2016 14:16:00 GMT Server: Apache Set-Cookie: PHPSESSID=oo8rs47jcbt7e75ba2tghkj86eps9r75b57sn826rhvlb99v2slkjpmh8ms6hu1p2m2ja0sg86oia4ue5ca067a975a81itbddo1ko0; path=/ Expires: Thu, 19 Nov 1981 08:52:00 GMT Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0 Pragma: no-cache Location: /nl/nl/export/download/t/Mg==/c/MTQ=/ Vary: Accept-Encoding Content-Length: 0 Content-Type: text/html; charset=UTF-8 – user3824329 Feb 08 '16 at 14:16
  • And if I just store the result of downloadZipFile, it returns 1, which I think means that it succeeded. – user3824329 Feb 08 '16 at 14:18
  • no, `file_put_contents` returns the number of bytes that were written to the file, or FALSE on failure. – RomanPerekhrest Feb 08 '16 at 14:22
  • OK, I understand that, but the line return ($result !== false)? true : false; makes it return 1 on success and 0 on failure. – user3824329 Feb 08 '16 at 14:23
  • Sorry, but I don't receive an error. I really don't know what the problem is here. Any other ideas? – user3824329 Feb 08 '16 at 14:33
  • Yes: download the file the regular way and check its size in bytes. Then put `var_dump($result);` right before the `return` statement within the function and invoke the `downloadZipFile` function. Compare the two sizes: ideally, they should coincide. – RomanPerekhrest Feb 08 '16 at 14:40
  • What is the real file size? – RomanPerekhrest Feb 08 '16 at 14:45
  • If I use var_dump I receive int(478), and if I download the file in my browser the file is 2773Kb. What can we conclude from that? – user3824329 Feb 08 '16 at 14:48
  • ok, try my second approach, it also works for me. Tell me the result – RomanPerekhrest Feb 08 '16 at 15:01
  • I've tried your second approach now, and it seems we are heading in the right direction. The downloaded file is now 60kb, but it still says that it is corrupted or damaged and cannot be opened. – user3824329 Feb 08 '16 at 15:08
  • It looks like a time limit or a PHP size limit. Try adding `set_time_limit(0);` before the function call and add the option `curl_setopt($ch, CURLOPT_TIMEOUT, 50);`. Also check the `upload_max_filesize` and `post_max_size` settings in the `php.ini` config file. – RomanPerekhrest Feb 08 '16 at 15:12
  • I've added all of those now and raised the limits so that they cannot be the cause. But if I open the .zip file in Notepad++, I see that I just get the login page from the website, and it says: incorrect login. – user3824329 Feb 08 '16 at 15:26
  • Do you have any other suggestions? I want to thank you for all your time and help! – user3824329 Feb 08 '16 at 15:48
  • @RomanPerekhrest: you might consider updating your first approach to remove the CURLOPT_HEADER option. Leaving it as-is prepends the header to the front of the ZIP archive, which can give some decompression engines a hard time. – ams Jan 18 '17 at 15:01
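
The 302 response quoted in the comments above sets a PHPSESSID cookie and then redirects to the download path. One plausible explanation, not confirmed anywhere in this thread, is that cURL follows the redirect without sending that cookie back, so the second request looks like it comes from a visitor who is not logged in. A minimal sketch under that assumption: enable cURL's cookie engine so cookies set during the redirect chain are sent on the following requests. The function name downloadWithSession and the timeout values are hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: follows redirects and keeps cookies set along the way.
function downloadWithSession(string $url, string $filepath): bool
{
    $fp = fopen($filepath, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FILE           => $fp,   // stream the response body straight to disk
        CURLOPT_FOLLOWLOCATION => true,  // follow the 302 to the download URL
        CURLOPT_MAXREDIRS      => 5,     // guard against redirect loops
        CURLOPT_COOKIEFILE     => '',    // empty string enables the in-memory cookie engine
        CURLOPT_FAILONERROR    => true,  // treat HTTP status >= 400 as a failure
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 300,
    ]);

    $ok = curl_exec($ch);
    if ($ok === false) {
        error_log('cURL error: ' . curl_error($ch));
    }
    curl_close($ch);
    fclose($fp);

    // Make sure filesize() does not return a stale cached value.
    clearstatcache(true, $filepath);
    return $ok !== false && filesize($filepath) > 0;
}

// Usage (URL taken from the question, with placeholders left in place):
// downloadWithSession('https://www.tyre24.com/nl/nl/user/login/userid/USERID/password/PASSWORD/page/L2V4cG9ydC9kb3dubG9hZC90L01nPT0vYy9NVFE9Lw==', 'result.zip');
```

If the server also depends on other browser behaviour (for example a particular User-Agent), this alone may not be enough; it only addresses the cookie being dropped between the login request and the redirect.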

Add the following lines of code after curl_init(). I think this will work.

CURLOPT_RETURNTRANSFER: TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it directly.

CURLOPT_USERAGENT: The contents of the "User-Agent: " header to be used in an HTTP request.

Read more in the curl_setopt documentation.

 $ch = curl_init(str_replace(" ","%20",$url));
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
Renjith V R
  • No, sorry, it still doesn't work. I still get the same incorrect-login page back. – user3824329 Feb 08 '16 at 13:09
  • then please check these questions. http://stackoverflow.com/questions/6409462/downloading-a-large-file-using-curl – Renjith V R Feb 08 '16 at 13:13
  • I had used these posts before I posted my question here; I just added the SSL version to be sure. The other one I already had in my code with the set_timeout(50). So that is not the solution. Is there anything else I could try? – user3824329 Feb 08 '16 at 13:24
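
In the comments on the first answer, the saved "ZIP" turned out to be the HTML login page when opened in a text editor. A quick check of the first bytes can catch that case before the file is handed to an unzip tool. This is only a sketch: the helper name looksLikeZip is hypothetical, and it tests just the two most common ZIP signatures rather than validating the whole archive.

```php
<?php
// Hypothetical helper: returns true when the file starts with a ZIP signature.
function looksLikeZip(string $filepath): bool
{
    $fh = @fopen($filepath, 'rb');
    if ($fh === false) {
        return false;
    }
    $magic = fread($fh, 4);
    fclose($fh);

    // "PK\x03\x04" starts a regular archive; "PK\x05\x06" is an empty archive.
    return $magic === "PK\x03\x04" || $magic === "PK\x05\x06";
}

// Usage:
// if (!looksLikeZip('result.zip')) {
//     echo "Download is not a ZIP file; it may be an HTML error or login page.\n";
// }
```

Comparing the byte count against the size of a browser download, as suggested in the comments, remains a useful second check.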