
Please consider the following:

I have a website that users upload content to as PDFs. I would like to restrict access to this content in some way. The plan is for a PHP script to authenticate the user and then load a local PDF using PDF.js so that it works on all devices.

I am making use of the viewer.js code supplied with PDF.js.

I have tried using .htaccess to only allow PDFs to load if the request comes from the server's IP address, but to no avail: it appears to block any attempt by PDF.js to pull the PDF as well.

Is there a way in PDF.js to force it to load the file locally, rather than fetching it as a URL? Perhaps then I could just deny all in .htaccess and still allow PDF.js to load it?
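The deny-all idea would look something like this (a sketch using Apache 2.4 syntax, not my exact file; a PHP script reading the file from disk would not be affected by this rule):

```apacheconf
# Block all direct HTTP access to PDFs in this directory.
# Server-side code reading from the filesystem bypasses this entirely.
<FilesMatch "\.pdf$">
    Require all denied
</FilesMatch>
```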

Please bear in mind I am using the code found in viewer.js in the web directory of the stable download. I am unable to get any of the "Examples" on the PDF.js site to work, specifically this line: var pdfjsLib = window['pdfjs-dist/build/pdf']; (this will be down to my limited knowledge). If anyone is able to explain this, bonus.

I am totally open to other ways to solve this problem, and I hope someone can tell me that this is an awful idea and provide a far better way to do it.

Edit

Just to confirm, as I don't think I was very clear initially: I still want users to be able to view the content through the webpage that uses PDF.js; however, I don't want just anybody to be able to go to the direct URL path and view the content.

Aphire
  • `GET` is considered insecure for authentication purposes. "Because you use GET, it is very easy for the key to leak through browser histories, or accidental link sharing.", etc. https://security.stackexchange.com/questions/147188/is-it-bad-practice-to-use-get-method-as-login-username-password-for-administrato – HoldOffHunger Oct 03 '18 at 16:41
  • Yeah I understand and would never use this method for any "seriously" secure areas.(usually more to stop bots stumbling on stuff) Thanks for the comment – Aphire Oct 03 '18 at 16:47

2 Answers


Create pdf.php as your endpoint for getting PDF files:

<?php

$file = "tracemonkey.pdf";

if (!$loggedIn) { // Update with your own authentication logic
    http_response_code(403);
    exit;
}

header("Content-Type: application/octet-stream");
header("Content-Disposition: attachment; filename=" . $file);

echo file_get_contents(__DIR__ . '/' . $file);

Then in your JS viewer just swap out the URL:

var url = 'pdf.php';

This way PHP acts as a kind of proxy to your files. You'll need to plug in your own logic for locating files and for deciding what counts as an authenticated user, whether you derive that from a GET parameter, have a file lookup system, etc.
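For example, the lookup approach might look something like the following sketch, where a whitelist maps public ids to real paths so the filename never comes from user input (the ids, paths and session check here are all illustrative placeholders for your own logic):

```php
<?php
// Illustrative id-to-path whitelist; in practice this might come from a DB.
$files = [
    'welcome'     => __DIR__ . '/storage/welcome.pdf',
    'tracemonkey' => __DIR__ . '/storage/tracemonkey.pdf',
];

session_start();
if (empty($_SESSION['loggedIn'])) { // placeholder auth check
    http_response_code(403);
    exit;
}

$id = $_GET['id'] ?? '';
if (!isset($files[$id])) { // unknown id: never touch the filesystem
    http_response_code(404);
    exit;
}

header('Content-Type: application/pdf');
header('Content-Length: ' . filesize($files[$id]));
echo file_get_contents($files[$id]);
```

The viewer would then request something like pdf.php?id=tracemonkey.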

Royal Wares
  • It's worth noting that file_get_contents is tied to your available memory, so if you're running a low-powered server, Bernz's answer may be more suitable; on the flip side, if you have to deal with very large files it will again be more suitable, as it takes memory into account. If your use case is pretty tame then just go for file_get_contents. – Royal Wares Oct 25 '18 at 14:49
  • Thank you very much for such an elegant solution. I will award Bounty when I can (23 hours I think atm) – Aphire Oct 25 '18 at 15:13

The solution we use for this is the following:

  1. Store the PDF files outside of the web directory
  2. Access the PDF through a script (PHP or other; let's call it download_file.php in my case) and feed it to your client using a chunked read function.

Your script can validate the session, check the user's rights, and then read the file and send the correct headers to the user. This is much more flexible than simply accessing the file directly.

This way, your PDF.JS file could link to download_file.php?file_id=123123 instead of my_read_file.pdf, where your script could link file_id 123123 to the actual PDF.
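That mapping could be as simple as the following sketch (the id, path and lookup table are illustrative; yours might come from a database), after which the script sends the headers and calls readfile_chunked as shown below:

```php
<?php
// Sketch of the lookup step: resolve a public file_id to a path
// OUTSIDE the web directory, so it can never be fetched directly.
$paths = [
    '123123' => '/var/private_pdfs/my_read_file.pdf', // illustrative
];

$fileId = $_GET['file_id'] ?? '';
if (!isset($paths[$fileId])) {
    http_response_code(404);
    exit;
}

$filename = $paths[$fileId];
$mime     = 'application/pdf';
// ...send headers and stream the file from here.
```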

My download_file.php script looks something like this:

<?php

// $filename : full path to your actual file, NOT located in your web directory
// $mime     : MIME type of your file

header('Pragma: public');
header('Cache-Control: private');
header('Expires: ' . gmdate("D, d M Y H:i:s", strtotime("+2 DAYS", time())) . " GMT");
header('Last-Modified: ' . gmdate("D, d M Y H:i:s", time()) . " GMT");
header('Content-Length: ' . filesize($filename));
header('Content-Type: ' . $mime);

set_time_limit(0);
readfile_chunked($filename);

// Read and output the file in fixed-size chunks, so only one chunk
// is held in memory at a time regardless of the file's size.
function readfile_chunked($filename) {
    $chunksize = 1 * (1024 * 1024); // bytes per chunk
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }
    while (!feof($handle)) {
        print fread($handle, $chunksize);
        flush();
        @ob_flush();
    }
    return fclose($handle);
}
Bernz
  • Thank you very much for the answer - it shows a great alternative. – Aphire Oct 25 '18 at 15:13
  • For future readers, this answer also works and depending on your use case (i.e. needing support for large files or low server memory) it may be more correct for you. – Royal Wares Oct 25 '18 at 15:17