
I want to play a video from a remote server, so I wrote this function.

$remoteFile = 'blabla.com/video_5GB.mp4';
play($remoteFile);
function play($url){
    ini_set('memory_limit', '1024M');
    set_time_limit(3600);
    ob_start();
    if (isset($_SERVER['HTTP_RANGE'])) $opts['http']['header'] = "Range: " . $_SERVER['HTTP_RANGE'];
    $opts['http']['method'] = "HEAD";
    $conh = stream_context_create($opts);
    $opts['http']['method'] = "GET";
    $cong = stream_context_create($opts);
    $out[] = file_get_contents($url, false, $conh);
    $out[] = $http_response_header;
    ob_end_clean();
    array_map("header", $http_response_header);
    readfile($url, false, $cong);
}

The above function works very well for playing videos, but I don't want to burden the remote server.

My question is: how can I cache the video files on my server and refresh the cache every 5 hours? If possible, the cache folder should contain small files (5MB / 10MB each) made from the remote video.

Leo
  • Does this answer your question? [php stream file from remote server](https://stackoverflow.com/questions/30529844/php-stream-file-from-remote-server) – RïshïKêsh Kümar Jul 20 '20 at 10:00
  • nope. I want to cache the video from remote video to my server – Leo Jul 20 '20 at 10:05
  • _“if possible, the cache folder contains small files (5MB / 10MB) from remote video”_ - you mean, you want to cache one _big_ video in multiple such small chunks on your end? I doubt that makes much sense. If the clients send range requests, then you probably won’t know how big those ranges will be beforehand. Meaning, in the worst case, you would have to assemble the response by concatenating multiple of those smaller chunks, and probably even cutting parts from the first and last chunk. […] – CBroe Jul 20 '20 at 11:03
  • […] And if they don’t, you will still have to concatenate _all_ chunks back together to send the whole video in one go. If anything, you should cache the whole video file on your server. Let your web server then handle serving the proper responses to any range requests. – CBroe Jul 20 '20 at 11:04
  • This intrigued me and I have managed to do it (only tested with MP4). It relies on calling a php script with exec() to generate the cache while the main request serves the video from the remote url. Once the cache has been generated it serves the video from there. Files broken down into 10MB chunks as needed. Will calling the php with exec() work for your server? If so I can share the code. – Mark Jul 21 '20 at 17:41
  • Yes. @Mark, please share the code :) – Leo Jul 22 '20 at 06:52

1 Answer


As mentioned in my comment, the following code has been tested only on a small selection of MP4 files. It could probably do with some more work but it does fill your immediate needs as it is.

It uses exec() to spawn a separate process that generates the cache files when they are needed, i.e. on the first request or after 5 hours. Each video must have its own cache folder because the cached chunks are simply called 1, 2, 3, etc. Please see additional comments in the code.
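
The decision described above (serve from the remote while caching is running, regenerate when the cache is missing or older than 5 hours, otherwise serve from the cache) can be sketched as one small function. This is only an illustration: `cacheState()` and its parameters are hypothetical names and do not appear in the scripts below, although the marker files `caching.log` and `completed.log` are the ones those scripts use.

```php
<?php
/**
 * Hypothetical helper: classifies a cache folder using the same three
 * cases that play() distinguishes.
 *
 * @return string 'caching', 'stale' or 'fresh'
 */
function cacheState(string $cacheFolder, int $maxAgeSeconds = 18000, ?int $now = null): string
{
    $now = $now ?? time();
    $caching = $cacheFolder . DIRECTORY_SEPARATOR . 'caching.log';
    $completed = $cacheFolder . DIRECTORY_SEPARATOR . 'completed.log';

    // is_file() and filemtime() results are cached within a request
    clearstatcache();

    if (is_file($caching)) {
        // a cache run is in progress, keep serving from the remote url
        return 'caching';
    }
    if (!is_file($completed) || filemtime($completed) + $maxAgeSeconds < $now) {
        // never cached, or the last completed run is older than the limit
        return 'stale';
    }
    return 'fresh';
}
```

The default of 18000 seconds is simply 5 * 60 * 60; passing `$now` explicitly is an assumption made here to keep the function easy to test.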

play.php - This is the script that will be called by the users from the browser

<?php
ini_set('memory_limit', '1024M');
set_time_limit(3600);

$remoteFile = 'blabla.com/video_5GB.mp4';

play($remoteFile);

/**
 * @param string $url
 *
 * This will serve the video from the remote url
 */
function playFromRemote($url)
{
  ob_start();
  $opts = array();
  if(isset($_SERVER['HTTP_RANGE']))
  {
    $opts['http']['header'] = "Range: ".$_SERVER['HTTP_RANGE'];
  }
  $opts['http']['method'] = "HEAD";
  $conh = stream_context_create($opts);
  $opts['http']['method'] = "GET";
  $cong = stream_context_create($opts);
  $out[] = file_get_contents($url, false, $conh);
  $out[] = $http_response_header;
  ob_end_clean();

  $fh = fopen('response.log', 'a');
  if($fh !== false)
  {
    fwrite($fh, print_r($http_response_header, true)."\n\n\n\n");
    fclose($fh);
  }

  array_map("header", $http_response_header);
  readfile($url, false, $cong);
}

/**
 * @param string $cacheFolder Directory in which to find the cached chunk files
 * @param string $url
 *
 * This will serve the video from the cache, it uses a "completed.log" file which holds the byte ranges of each chunk
 * this makes it easier to locate the first chunk of a range request. The file is generated by the cache script
 */
function playFromCache($cacheFolder, $url)
{
  $bytesFrom = 0;
  $bytesTo = 0;
  if(isset($_SERVER['HTTP_RANGE']))
  {
    //the client asked for a specific range, extract those from the http_range server var
    //can take the form "bytes=123-567" or just a from "bytes=123-"
    $matches = array();
    if(preg_match('/^bytes=(\d+)-(\d+)?$/', $_SERVER['HTTP_RANGE'], $matches))
    {
      $bytesFrom = intval($matches[1]);
      if(!empty($matches[2]))
      {
        $bytesTo = intval($matches[2]);
      }
    }
  }

  //completed log is a json_encoded file containing an array of byte ranges that directly
  //correspond with the chunk files generated by the cache script
  $log = json_decode(file_get_contents($cacheFolder.DIRECTORY_SEPARATOR.'completed.log'));
  $totalBytes = 0;
  $chunk = 0;
  foreach($log as $ind => $bytes)
  {
    //find the first chunk file we need to open
    if($bytes[0] <= $bytesFrom && $bytes[1] > $bytesFrom)
    {
      $chunk = $ind + 1;
    }
    //and while we are at it save the last byte range "to" which is the total number of bytes of all the chunk files
    $totalBytes = $bytes[1];
  }

  if($bytesTo === 0)
  {
    if($totalBytes === 0)
    {
      //if we get here then something is wrong with the cache, revert to serving from the remote
      playFromRemote($url);
      return;
    }
    $bytesTo = $totalBytes - 1;
  }

  //calculate how many bytes will be returned in this request
  $contentLength = $bytesTo - $bytesFrom + 1;

  //send some headers - I have hardcoded MP4 here because that is all I have developed with
  //if you are using different video formats then testing and changes will no doubt be required
  header('Content-Type: video/mp4');
  header('Content-Length: '.$contentLength);
  header('Accept-Ranges: bytes');

  //Send a header so we can recognise that the content was indeed served by the cache
  header('X-Cached-Date: '.(date('Y-m-d H:i:s', filemtime($cacheFolder.DIRECTORY_SEPARATOR.'completed.log'))));
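  if(isset($_SERVER['HTTP_RANGE']))
  {
    //We are sending back a range so it needs a header and the http response must be 206: Partial Content
    //this also covers ranges that start at byte 0, e.g. "bytes=0-999", which would otherwise look like a complete file
    header(sprintf('Content-Range: bytes %s-%s/%s', $bytesFrom, $bytesTo, $totalBytes));
    http_response_code(206);
  }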

  $bytesSent = 0;
  while(is_file($cacheFolder.DIRECTORY_SEPARATOR.$chunk) && $bytesSent < $contentLength)
  {
    $cfh = fopen($cacheFolder.DIRECTORY_SEPARATOR.$chunk, 'rb');
    if($cfh !== false)
    {
      //if we are fetching a range then we might need to seek the correct starting point in the first chunk we look at
      //this check will be performed on all chunks but only the first one should need seeking so no harm done
      if($log[$chunk - 1][0] < $bytesFrom)
      {
        fseek($cfh, $bytesFrom - $log[$chunk - 1][0]);
      }
      //read and send data until the end of the file or we have sent what was requested
      while(!feof($cfh) && $bytesSent < $contentLength)
      {
        $data = fread($cfh, 1024);
        //check we are not going to be sending too much back and if we are then truncate the data to the correct length
        if($bytesSent + strlen($data) > $contentLength)
        {
          $data = substr($data, 0, $contentLength - $bytesSent);
        }
        $bytesSent += strlen($data);
        echo $data;
      }
      fclose($cfh);
    }
    //move to the next chunk
    $chunk ++;
  }
}

function play($url)
{
  //I have chosen a simple way to make a folder name, this can be improved any way you need
  //IMPORTANT: Each video must have its own cache folder
  $cacheFolder = sha1($url);
  if(!is_dir($cacheFolder))
  {
    mkdir($cacheFolder, 0755, true);
  }

  //First check if we are currently in the process of generating the cache and so just play from remote
  if(is_file($cacheFolder.DIRECTORY_SEPARATOR.'caching.log'))
  {
    playFromRemote($url);
  }
  //Otherwise check if we have never completed the cache or it was completed more than 5 hours ago and if so spawn a process to generate the cache
  elseif(!is_file($cacheFolder.DIRECTORY_SEPARATOR.'completed.log') || filemtime($cacheFolder.DIRECTORY_SEPARATOR.'completed.log') + (5 * 60 * 60) < time())
  {
    //fork the caching to a separate process - the trailing & runs it as a background task
    //and echo $! prints its process ID so exec() can return immediately
    //output is redirected to /dev/null, otherwise exec() waits until the background process has finished
    //The cache script can be anywhere, pass its path to sprintf in the first position
    //A base64 encoded url is passed in as argument 1, sprintf second position
    $cmd = sprintf('php %s %s > /dev/null 2>&1 & echo $!', escapeshellarg(__DIR__.DIRECTORY_SEPARATOR.'cache.php'), escapeshellarg(base64_encode($url)));
    $pid = exec($cmd);

    //with that started we need to serve the request from the remote url
    playFromRemote($url);
  }
  else
  {
    //if we got this far then we have a completed cache so serve from there
    playFromCache($cacheFolder, $url);
  }
}
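
To make the chunk lookup in playFromCache() easier to follow, here is a minimal sketch of the same idea as a pure function: given the decoded completed.log and an inclusive byte range, it lists which chunk files are needed and which part of each one to read. `chunkSegments()` and the array keys it returns are hypothetical names used for illustration only. It assumes the log format written by cache.php, where each entry is a `[from, to]` pair and `to` is exclusive.

```php
<?php
/**
 * Hypothetical helper: maps an inclusive byte range onto chunk files.
 *
 * @param array $log       list of [from, to) pairs, one per chunk file
 * @param int   $bytesFrom first requested byte (inclusive)
 * @param int   $bytesTo   last requested byte (inclusive)
 * @return array list of ['chunk' => int, 'offset' => int, 'length' => int]
 */
function chunkSegments(array $log, int $bytesFrom, int $bytesTo): array
{
    $segments = [];
    foreach ($log as $index => [$start, $end]) {
        // skip chunks that lie entirely outside the requested range
        if ($end <= $bytesFrom || $start > $bytesTo) {
            continue;
        }
        $readFrom = max($start, $bytesFrom);
        $readTo = min($end - 1, $bytesTo);
        $segments[] = [
            'chunk' => $index + 1,            // chunk files are named 1, 2, 3, ...
            'offset' => $readFrom - $start,   // where to fseek() inside the chunk
            'length' => $readTo - $readFrom + 1,
        ];
    }
    return $segments;
}

// Three chunks covering 25 bytes, request for bytes 8-21
print_r(chunkSegments([[0, 10], [10, 20], [20, 25]], 8, 21));
```

The lengths of the returned segments always add up to `$bytesTo - $bytesFrom + 1` when the range lies inside the cached file, which is the Content-Length that playFromCache() sends.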

cache.php - This script will be called by play.php via exec()

<?php
//This script expects as argument 1 a base64 encoded url
if(count($argv)!==2)
{
  die('Invalid Request!');
}

$url = base64_decode($argv[1]);

//make sure to use the same method of obtaining the cache folder name as the main play script
//or change the code to pass it in as an argument
$cacheFolder = sha1($url);

if(!is_dir($cacheFolder))
{
  die('Invalid Arguments!');
}

//double check it is not already running
if(is_file($cacheFolder.DIRECTORY_SEPARATOR.'caching.log'))
{
  die('Already Running');
}

//create a file so we know this has started, the file will be removed at the end of the script
file_put_contents($cacheFolder.DIRECTORY_SEPARATOR.'caching.log', date('d/m/Y H:i:s'));

//get rid of the old completed log
if(is_file($cacheFolder.DIRECTORY_SEPARATOR.'completed.log'))
{
  unlink($cacheFolder.DIRECTORY_SEPARATOR.'completed.log');
}

$bytesFrom = 0;
$bytesWritten = 0;
$totalBytes = 0;

//this is the size of the chunk files, currently 10MB
$maxSizeInBytes = 10 * 1024 * 1024;
$chunk = 1;

//open the url for binary reading and first chunk for binary writing
$fh = fopen($url, 'rb');
$cfh = fopen($cacheFolder.DIRECTORY_SEPARATOR.$chunk, 'wb');

if($fh !== false && $cfh!==false)
{
  $log = array();
  while(!feof($fh))
  {
    $data = fread($fh, 1024);
    fwrite($cfh, $data);
    $totalBytes += strlen($data); //use actual length here
    $bytesWritten += strlen($data);

    //if we are at or past the chunk size then close the chunk and open a new one
    //keeping a log of the byte range of the chunk
    if($bytesWritten>=$maxSizeInBytes)
    {
      $log[$chunk-1] = array($bytesFrom,$totalBytes);
      $bytesFrom = $totalBytes;
      fclose($cfh);
      $chunk++;
      $bytesWritten = 0;
      $cfh = fopen($cacheFolder.DIRECTORY_SEPARATOR.$chunk, 'wb');
    }
  }
  fclose($fh);
  $log[$chunk-1] = array($bytesFrom,$totalBytes);
  fclose($cfh);

  //write the completed log. This is a json encoded string of the chunk byte ranges and will be used
  //by the play script to quickly locate the starting chunk of a range request
  file_put_contents($cacheFolder.DIRECTORY_SEPARATOR.'completed.log', json_encode($log));

  //finally remove the caching log so the play script doesn't think the process is still running
  unlink($cacheFolder.DIRECTORY_SEPARATOR.'caching.log');
}
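
The chunking loop above can also be tried out in isolation on any readable stream, which may help when testing with small files. The following is a rough variant, not the code used in cache.php: `splitStream()` is a hypothetical name, and unlike the loop above it caps each read so that every chunk is exactly `$maxBytes` long (except the last) and no empty trailing chunk is created.

```php
<?php
/**
 * Hypothetical helper: copies $in into numbered chunk files (1, 2, 3, ...)
 * of at most $maxBytes each and returns the [from, to) byte range of every
 * chunk, in the same shape as completed.log.
 *
 * @param resource $in readable stream
 */
function splitStream($in, string $dir, int $maxBytes): array
{
    $log = [];
    $chunk = 0;
    $offset = 0;   // bytes copied into finished chunks so far
    $written = 0;  // bytes in the chunk that is currently open
    $out = null;

    while (!feof($in)) {
        // never read more than still fits in the current chunk
        $data = fread($in, min(8192, $maxBytes - $written));
        if ($data === false) {
            break;
        }
        if ($data === '') {
            continue;
        }
        if ($out === null) {
            // open the next chunk file only once there is data for it
            $chunk++;
            $out = fopen($dir . DIRECTORY_SEPARATOR . $chunk, 'wb');
            if ($out === false) {
                throw new RuntimeException('Cannot open chunk ' . $chunk);
            }
        }
        fwrite($out, $data);
        $written += strlen($data);
        if ($written === $maxBytes) {
            fclose($out);
            $log[] = [$offset, $offset + $written];
            $offset += $written;
            $written = 0;
            $out = null;
        }
    }
    if ($out !== null) {
        fclose($out);
        $log[] = [$offset, $offset + $written];
    }
    return $log;
}
```

For a quick local check, `$in` can be a `php://memory` stream filled with a few bytes and `$maxBytes` a small number such as 10.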
Mark