Is it possible to check the response headers (200=OK) and download a file in a single CURL request?

Here is my code. The problem is that it makes two requests, so the second response can differ from the first and the saved file can be overwritten with an error response. This is a problem with a rate-limited API. I searched here on Stack Overflow, but most solutions still make two requests.

// Check the response first; we don't want to write an error response to the file
$urlCheck = checkRemoteFile($to_download);

if ($urlCheck) {
    // Response is 200, continue
} else {
    // Do not overwrite existing file
    echo 'Download failed, response code header is not 200';
    exit();
}

// File Handling
$new_file = fopen($downloaded, "w") or die("cannot open " . $downloaded);

// Setting the curl operations
$cd = curl_init();
curl_setopt($cd, CURLOPT_URL, $to_download);
curl_setopt($cd, CURLOPT_FILE, $new_file);
curl_setopt($cd, CURLOPT_TIMEOUT, 30); // timeout is 30 seconds, to download the large files you may need to increase the timeout limit.

// Running curl to download file
curl_exec($cd);
if (curl_errno($cd)) {
    echo "the cURL error is : " . curl_error($cd);
} else {
    $status = curl_getinfo($cd);
    echo $status["http_code"] == 200 ? "File Downloaded" : "The error code is : " . $status["http_code"] ;
    // the http status 200 means everything is going well. the error codes can be 401, 403 or 404.
}

// close and finalize the operations.
curl_close($cd);
fclose($new_file);

# FUNCTIONS

function checkRemoteFile($url) {
    $curl = curl_init($url);

    //don't fetch the actual page, you only want to check the connection is ok
    curl_setopt($curl, CURLOPT_NOBODY, true);

    //do request
    $result = curl_exec($curl);

    $ret = false;

    //if request did not fail
    if ($result !== false) {
        //if request was ok, check response code
        $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($statusCode == 200) {
            $ret = true;
        }
    }

    curl_close($curl);

    return $ret;
}
nibb11
  • Does this answer your question? [Remote file size without downloading file](https://stackoverflow.com/questions/2602612/remote-file-size-without-downloading-file) – math deaman Jul 24 '22 at 06:32
  • For “the saved file will be overwritten”, always write to a unique tmp file and only move it into place if your success criteria are met. For the rest, HEAD requests don’t always have all of the information, but they might work for your scenario. Yes, a HEAD request is a request nonetheless, so HEAD followed by GET will be two requests. Servers generally want to dump the response and close the connection, so there’s no way for the server to send headers and wait for you to send notice about whether you want the body or not. – Chris Haas Jul 24 '22 at 13:10
  • The problem with those solutions is that they still make two requests. The same goes for the temporary file: a HEAD request is counted as a request by the API. Imagine this: the first request returns 200, but the second request returns 401 access denied. You just wrote the 401 response to the temporary file and not the actual result from the API. – nibb11 Jul 24 '22 at 13:18
  • Yes, and I’m saying that’s the nature of HTTP, and also why I said HEAD requests might not be viable for many situations. The solution is really to just make a single GET request, download the whole thing, inspect the status code and perform your logic. The only alternative is to ask the API developer to open up HEAD requests or somehow change their API. For instance, the API could return a custom meta response that you could act upon if you so chose. But that isn’t likely. – Chris Haas Jul 24 '22 at 13:33
  • Thanks, I think that might be the only way: download the response to a temporary file, inspect it, then rename it if successful. I was just curious whether it was possible to keep the cURL response in memory somehow before writing to disk, but it seems it is not possible to keep everything in a single HTTP request. – nibb11 Jul 24 '22 at 17:56
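
The approach the comments converge on (one GET request, written to a temporary file, renamed only when the status is 200) could look roughly like the sketch below. `downloadIfOk()` and its parameters are hypothetical names chosen for illustration, not part of the question's code, and the sketch assumes the destination directory is writable so that the temporary file and the final file sit on the same filesystem.

```php
<?php
// Hypothetical helper (not from the question): download $url with a single
// GET request and replace $dest only if the response status is 200.
function downloadIfOk(string $url, string $dest, int $timeout = 30): bool
{
    // Write to a unique temporary file next to the destination, so the
    // existing file is never touched by a failed or non-200 response.
    $tmp = tempnam(dirname($dest), 'dl_');
    if ($tmp === false) {
        return false;
    }
    $fh = fopen($tmp, 'w');
    if ($fh === false) {
        unlink($tmp);
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fh);          // stream the body to the temp file
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);  // same 30 s default as the question

    $ok = curl_exec($ch);                                   // the only request made
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);  // status of that same request
    unset($ch);   // release the handle (curl_close() has no effect since PHP 8.0)
    fclose($fh);

    if ($ok === false || $status !== 200) {
        unlink($tmp);   // discard the error body; $dest stays as it was
        return false;
    }

    // Move the finished download into place only after the status check passed.
    return rename($tmp, $dest);
}
```

Calling `downloadIfOk($to_download, $downloaded)` would then replace both `checkRemoteFile()` and the second cURL block from the question. Because the status code is read from the same response whose body was written to the temporary file, the 200-then-401 scenario described above cannot occur. The same check also works in memory with `CURLOPT_RETURNTRANSFER` instead of `CURLOPT_FILE`, at the cost of holding the whole body in RAM.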

0 Answers