
I am trying to download a large list of mp4 files by looping through them and using file_put_contents() to save to a directory. The problem is that only the last item in the video list is getting downloaded.

Here is my code:

<?php
$i = 0;
foreach ($response['videos'] as $row){
    $i++;
    if($row['status'] != 'failed') {
        $videoId = $row['key'];
        $videoName = $row['title'];
        $filename = str_replace(' ', '-', $videoName); // replace spaces with hyphens in the filename
        

        // Build the file URL for this video
        $url = "http://content.jwplatform.com/videos/{$videoId}.mp4";
          
        // Fetch the file from the URL with file_get_contents()
        // and save it with file_put_contents()
        if (file_put_contents("Videos/".$filename.$i.".mp4", file_get_contents($url)))
        {
            echo "File downloaded successfully.";
            //sleep(5);
        }
        else
        {
            echo "File downloading failed.";
        }
        
    }
}
?>

I tried a cURL function instead of file_put_contents(), and it successfully placed all of the files in my Videos directory, but they were all empty. I believe they were empty because these mp4 URLs are secured: when you open one in the browser, it actually takes you to a different secure URL to view and download the video. The cURL function could not get the file data, but file_get_contents() does seem to get it (only for the last item, though).

In my code above, I believe the variables in the loop are being overwritten over and over until the last item is reached, and only then does file_put_contents() execute. If that is the case, how can I make it run on every iteration so that all of the files are downloaded?
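A quick way to verify that the loop body really runs once per video is to log each iteration just before the download call; a minimal sketch (the log line is hypothetical, using the variables from the loop above):

        // Hypothetical debug logging: confirms the loop reaches each video
        error_log("iteration {$i}: {$videoId} -> Videos/{$filename}{$i}.mp4");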

Edit: here is some of the output of var_export($response['videos']):

array (
  0 => array (
    'key' => 'eewww123',
    'title' => 'Video Name Example 1',
    'description' => NULL,
    'date' => 1604004019,
    'updated' => 1640011490,
    'expires_date' => NULL,
    'tags' => NULL,
    'link' => NULL,
    'author' => NULL,
    'size' => '240721720',
    'duration' => '229.79',
    'md5' => 'f0023423423423423423',
    'views' => 0,
    'status' => 'ready',
    'error' => NULL,
    'mediatype' => 'video',
    'sourcetype' => 'file',
    'sourceurl' => NULL,
    'sourceformat' => NULL,
    'upload_session_id' => NULL,
    'custom' => array ( ),
  ),
  1 => array (
    'key' => 'rr33445',
    'title' => 'Another Video Name Example 1',
    'description' => '',
    'date' => 1594316349,
    'updated' => 1640011493,
    'expires_date' => NULL,
    'tags' => NULL,
    'link' => '',
    'author' => NULL,
    'size' => '525702235',
    'duration' => '840.90',
    'md5' => '0044455sfsdgsdfs3245',
    'views' => 0,
    'status' => 'ready',
    'error' => NULL,
    'mediatype' => 'video',
    'sourcetype' => 'file',
    'sourceurl' => NULL,
    'sourceformat' => NULL,
    'upload_session_id' => NULL,
    'custom' => array ( ),
  ),
)

None of the rows have a failed status, and there are about 30 rows in total, but I have some other video lists to download with 900+ rows.

I enabled error reporting and I see

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 132120608 bytes)

on the line where my file_put_contents() function is.
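For context, file_get_contents() buffers the entire remote file in memory before file_put_contents() writes it out, which is why a ~126 MB allocation on top of everything else breaks the 256 MB limit. A minimal streaming sketch that keeps memory use flat (the function name is my own; it assumes allow_url_fopen is enabled, which it evidently is since file_get_contents($url) works):

function stream_download($url, $localPath) {
    $in  = fopen($url, 'rb');        // remote stream
    $out = fopen($localPath, 'wb');  // local file
    if ($in === false || $out === false) {
        return false;
    }
    // Copies in internal chunks, so memory use stays flat
    // regardless of file size.
    $bytes = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);
    return $bytes !== false;
}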

Here is the cURL function I used; it created all of the files successfully, but they were all empty:

function multiple_download(array $urls, $save_path = 'Videos') {
    $multi_handle = curl_multi_init();
    $file_pointers = [];
    $curl_handles = [];

    // Add curl multi handles, one per file we don't already have
    foreach ($urls as $key => $url) {
        $file = $save_path . '/' . basename($url);
        if (!is_file($file)) {
            $curl_handles[$key] = curl_init($url);
            $file_pointers[$key] = fopen($file, "w");
            curl_setopt($curl_handles[$key], CURLOPT_FILE, $file_pointers[$key]);
            curl_setopt($curl_handles[$key], CURLOPT_HEADER, 0);
            curl_setopt($curl_handles[$key], CURLOPT_CONNECTTIMEOUT, 60);
            curl_multi_add_handle($multi_handle, $curl_handles[$key]);
        }
    }

    // Download the files
    do {
        curl_multi_exec($multi_handle, $running);
    } while ($running > 0);

    // Free up objects (only for the handles we actually created,
    // since files that already existed were skipped above)
    foreach ($curl_handles as $key => $handle) {
        curl_multi_remove_handle($multi_handle, $handle);
        curl_close($handle);
        fclose($file_pointers[$key]);
    }
    curl_multi_close($multi_handle);
}
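One hedged guess about the empty files: unlike file_get_contents(), cURL does not follow HTTP redirects unless you tell it to, so if the CDN answers each request with a redirect to a signed URL, every transfer would finish with a zero-byte body. Adding this to the setup loop above might confirm it (an assumption, not something verified against this CDN):

// Assumption: the CDN redirects to a signed URL; cURL must be told
// to follow that redirect (file_get_contents() follows by default).
curl_setopt($curl_handles[$key], CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl_handles[$key], CURLOPT_MAXREDIRS, 5);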



multiple_download($videoURLs);

$videoURLs is an array that I built containing all the unique URLs, using the first PHP snippet above (with the download part commented out).
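For reference, a sketch of how that array could be built from the same response (hypothetical, mirroring the loop in the first snippet):

$videoURLs = [];
foreach ($response['videos'] as $row) {
    if ($row['status'] != 'failed') {
        // Same URL pattern as the download loop above
        $videoURLs[] = "http://content.jwplatform.com/videos/{$row['key']}.mp4";
    }
}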

  • Are you sure that the filename is correct and changes every time? If you can only save the last item, it seems you always overwrite the file because the name doesn't change; try naming the file using a counter, for example, to see what happens. – Biagio Boi Dec 22 '21 at 14:57
  • file_put_contents runs inside your loop, so it should write a new file every time. As mentioned above, the only problem might be if the video names are the same each time. If that's a possibility, it might be a good idea to add something unique to each filename before you try to save it. Obviously we can't currently see (a sample of) the content of `$response['videos']` so we can't really see what precisely would happen. – ADyson Dec 22 '21 at 15:02
  • The filenames are definitely unique each time, and I have tried adding a $i counter to the filename as well. This did not help. – GLevy Dec 22 '21 at 15:07
  • Maybe all the videos except one have a "failed" status? Just a guess. It's not obvious how this code would cause this issue. Again, please provide a sample of a few rows of relevant content from `$response['videos']` (e.g. you can show us the result of a var_export command) – ADyson Dec 22 '21 at 15:09
  • P.S. `I tried to use a CURL function to do this instead of file_put_contents() and it successfully placed all of the files to my Videos directory, but they were all empty files`... can you show what you tried, for comparison? On the face of it, this doesn't entirely make sense. – ADyson Dec 22 '21 at 15:13
  • Do you have legal permission to bulk download and store that content at all? – Daniel W. Dec 22 '21 at 15:15
  • I know they are not failing because if I put an echo statement inside of that condition, all of the URLs are output successfully. – GLevy Dec 22 '21 at 15:18
  • And it outputs "File downloading failed" _n_ times, and "File downloaded successfully" once, is that correct? – ADyson Dec 22 '21 at 15:21
  • I only see "File downloaded successfully" once and "File downloading failed" does not appear at all. It is as if it is skipping over all of the files until it gets to the last one. I found some other thread suggesting to add concatenation .= to the variables and by doing that I was able to get more file names to download, but it was downloading the same URL (data) to those file names instead of the unique mp4s – GLevy Dec 22 '21 at 15:33
  • Yes I have legal permission to download the files. – GLevy Dec 22 '21 at 15:34
  • `I only see "File downloaded successfully" once and "File downloading failed" does not appear at all`...this sounds like potentially the code crashes after the first loop iteration but for some reason you're not seeing an error - maybe check if PHP error reporting is definitely switched on, and/or check your PHP error log. It's hard to see what else it could be. – ADyson Dec 22 '21 at 15:44
  • `suggesting to add concatenation .= to the variables`...to which variables? – ADyson Dec 22 '21 at 15:50
  • That is helpful, thanks. I enabled error reporting and I see "Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 132120608 bytes)" on the line where my file_put_contents() function is. Any idea how to resolve something like this? I'm primarily a javascript developer so PHP is a bit new to me. – GLevy Dec 22 '21 at 15:53
  • I already tried adding ini_set("memory_limit","128M"); to the top of my file, but the error still appears. Is there a way I can decrease the memory required for this by updating my function? – GLevy Dec 22 '21 at 16:01
  • Well it's interesting it didn't do this with the cURL one, by the sounds of it? It's not really clear why you stopped using cURL instead of file_get_contents. There are also quite a lot of other suggestions if you [search a bit](https://www.google.com/search?q=php+file_get_contents+download+out+of+memory) – ADyson Dec 22 '21 at 16:24
  • Wow that link helped a lot, I got it working from following the first item in the search. I will post the solution in the main details above. – GLevy Dec 22 '21 at 16:34
  • No, please post the solution as an Answer below, that's how stackoverflow works! – ADyson Dec 22 '21 at 16:41
  • Probable duplicate of [file_put_contents and file_get_contents exhaust memory size](https://stackoverflow.com/questions/14713014/file-put-contents-and-file-get-contents-exhaust-memory-size) – ADyson Dec 22 '21 at 16:46

2 Answers

1

It turns out the issue was that file_get_contents() was exhausting the memory limit. From this post, I used the following function:

function custom_put_contents($source_url = '', $local_path = '') {

    // Remember the current limits so they can be restored afterwards
    $time_limit   = ini_get('max_execution_time');
    $memory_limit = ini_get('memory_limit');

    // Lift the limits for the duration of this download
    set_time_limit(0);
    ini_set('memory_limit', '-1');

    $remote_contents = file_get_contents($source_url);
    $response = file_put_contents($local_path, $remote_contents);

    // Restore the original limits
    set_time_limit((int) $time_limit);
    ini_set('memory_limit', $memory_limit);

    return $response;
}

This temporarily lifts the memory limit so the file can be retrieved, then restores the original settings once it is done. With this function I was able to download the files.
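Called from the loop in the question, the usage would look something like this (a sketch; variable names as in the original loop):

if (custom_put_contents($url, "Videos/" . $filename . $i . ".mp4")) {
    echo "File downloaded successfully.";
} else {
    echo "File downloading failed.";
}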

GLevy
  • Did you try using `copy()` to eliminate the memory issue? See this answer: https://stackoverflow.com/a/1372144/4630325 – Markus AO Dec 22 '21 at 17:44
  • Ugh wish I tried that first, but at least I learned a thing or two. Copy did work perfectly and it was way easier. – GLevy Dec 22 '21 at 20:09
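For completeness, the copy() approach from the comment above streams the source to the destination internally, so it sidesteps the memory limit without touching any ini settings. A minimal sketch (again assuming allow_url_fopen is enabled):

// copy() streams from the URL to the local path in chunks,
// so memory use stays flat even for large videos.
if (copy($url, "Videos/" . $filename . $i . ".mp4")) {
    echo "File downloaded successfully.";
}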
-2

You must use a flag to append to the file instead of overwriting it.

See the documentation for the FILE_APPEND flag: https://www.php.net/manual/fr/function.file-put-contents.php
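For reference, the flag is passed as the third argument (illustrative one-liner; $data stands for the downloaded bytes):

// Appends $data to the file instead of replacing its contents
file_put_contents("Videos/" . $filename . ".mp4", $data, FILE_APPEND);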

Edit: if all the files have the same name, it is possible that they overwrite each other. You must provide a different name in your loop:

foreach ($response['videos'] as $key => $row) {
    ...
    if (file_put_contents("Videos/" . $filename . $key . ".mp4", file_get_contents($url))) {
    ...

Using the $key of the loop in your file name makes it unique, so it will not be overwritten.

Florent Cardot
  • I have tried doing this, but it just makes the same mp4 file double in size each time I reload the browser. It does not help with the issue of only downloading the last file. – GLevy Dec 22 '21 at 14:56
  • So are all the other files empty? Or are they not created? – Florent Cardot Dec 22 '21 at 15:02
  • They are not created. Only the last file is created, and it is not empty; it works. They were only all created (and empty) when I used an entirely different function (cURL) not shown above. – GLevy Dec 22 '21 at 15:04