14

Possible Duplicate:
How to check file types of uploaded files in PHP?

My site has an upload feature that should accept only PDF files. How can I verify that an uploaded file really is a PDF, the way getimagesize() can verify image files? My code is below:

$whitelist = array(".pdf");

foreach ($whitelist as $item) {
    if (preg_match("/$item\$/i", $_FILES['uploadfile']['name'])) {
        
    }
    else {
        redirect_to("index.php");
    }
}

$uploaddir = 'uploads/';

$uploadfile = mysql_prep($uploaddir . basename($_FILES['uploadfile']['name']));

if (move_uploaded_file($_FILES['uploadfile']['tmp_name'], $uploadfile)) {
    echo "successfully uploaded";
}

The functions redirect_to and mysql_prep are my own. The MIME type sent with the upload can be spoofed through the request headers, so is there any way to check that the file is a genuine PDF?

John Kary
StaticVariable
  • 1
    Why do you have an empty `if`? And please use proper indentation for your code or bad things will happen. – PeeHaa Jun 14 '12 at 18:25
  • I just wanted to check whether it matches or not. – StaticVariable Jun 14 '12 at 18:31
  • Simply do: `if (!preg_match("/$item\$/i", $_FILES['uploadfile']['name'])) { redirect_to("index.php"); }` – PeeHaa Jun 14 '12 at 18:32
  • That is not an answer, @PeeHaa. I have already done this. How can I check PDF files the way the getimagesize() function checks image files? – StaticVariable Jun 14 '12 at 18:35
  • 4
    Wasn't an answer, but it is a comment. I'm telling you how to improve your code :) – PeeHaa Jun 14 '12 at 18:36
  • http://stackoverflow.com/questions/8309665/how-to-upload-pdf-file-only 1. Validate extensions (pdf, doc, docx); this is almost useless. 2. Validate the MIME type. 3. Open the PDF file, read the header (first line) and check if it contains one of these strings: %PDF-1.0, %PDF-1.1, %PDF-1.2, %PDF-1.3, %PDF-1.4. 4. Check if the file contains a string that specifies the number of pages by searching for multiple "/Page". – Artur Kedzior Jan 30 '13 at 12:32

3 Answers

20

You can check the MIME type of the file using PHP's File Info Functions. If they report the type 'application/pdf', the file should be a PDF.

The File Info Functions have been bundled with PHP since 5.3; on earlier versions you can use the mime_content_type function.
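As a minimal sketch of the File Info approach described above, the following checks the MIME type that libmagic detects from the file's contents. The helper name `looks_like_pdf_by_mime` is hypothetical (it does not come from the original post), and the sketch assumes the Fileinfo extension is enabled.

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
// Returns true when Fileinfo reports the file's content as application/pdf.
function looks_like_pdf_by_mime(string $path): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }

    // FILEINFO_MIME_TYPE asks for the bare type, without the charset suffix.
    $finfo = new finfo(FILEINFO_MIME_TYPE);

    // finfo::file() inspects the bytes on disk, not the type the client sent.
    return $finfo->file($path) === 'application/pdf';
}
```

With the upload from the question, this would be called on the temporary file before moving it, for example `looks_like_pdf_by_mime($_FILES['uploadfile']['tmp_name'])`. It is a content sniff rather than a full validation, so a crafted file can still pass and an unusual but genuine PDF may be reported as something else.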

Ben Evans
  • 2
    But if I have a PHP file and send a header with type="application/pdf", it will report the same type. – StaticVariable Jun 14 '12 at 18:33
  • That's true, but that is less likely and it depends how critical this is – Ben Evans Jun 14 '12 at 18:37
  • +1, that's the only correct answer here. The file info functions provide a way to get the **true** MIME type of the file – Sarfraz Jun 14 '12 at 18:38
  • I want to solve my problem. I have already checked what you are suggesting. Do you have any other answer? – StaticVariable Jun 14 '12 at 18:44
  • I don't want the MIME type. I want to check the content of the file and see whether it is PHP or not. – StaticVariable Jun 14 '12 at 18:46
  • 3
    It should be noted that (as mentioned [elsewhere on SO](https://stackoverflow.com/a/39676272/1654898)) the `mime_content_type()` function has [not been deprecated](http://php.net/manual/en/function.mime-content-type.php), and is explicitly mentioned as included in PHP7. – Alex Currie-Clark Aug 24 '17 at 10:49
  • Correct, @BenEvans you may care to update your answer as this is a prominent but incorrect response, and it'll save a lot of hassle for people who don't need to install the extension. (FYI, it seems there may have been a plan to deprecate it, but that's been reversed.) – Jeremy L. Nov 17 '17 at 18:49
  • I'd like to add that I hit upon a pdf-file which is a normal pdf-file in every way as far as I can tell, but mime_content_type($filename) as well as (new finfo(FILEINFO_MIME_TYPE))->file($filename) returns "application/octet-stream" – Clox Feb 24 '22 at 08:28
12
mime_content_type('file.ext');

mime_content_type()

HBv6
  • 4
    But please note that it [is deprecated](http://php.net/manual/en/function.mime-content-type.php). – PeeHaa Jun 14 '12 at 18:36
  • 2
    @PeeHaa can you update your response? This has not been deprecated. (See the link.) Maybe it was brought back to life? – Jeremy L. Nov 17 '17 at 18:51
  • 1
    The change has been reverted in [52d6b9aa9b0000744b727e4a596539371f06fd11](https://github.com/salathe/phpdoc-en/commit/52d6b9aa9b0000744b727e4a596539371f06fd11). PHP's bug site doesn't load at the moment, but once it is up again you can checkout https://bugs.php.net/bug.php?id=71367 to see the reason for it. @daprezjer – PeeHaa Nov 18 '17 at 00:12
11

Look for the PDF magic number by opening the file and reading the first few bytes of data. Most files have a specific format, and PDF files start with %PDF.

You can check the first 5 characters of the file. If they equal "%PDF-", it is likely a real PDF, although this does not prove it, as any file can begin with those 5 characters. The next 3 characters in a proper PDF file contain the version number (e.g. 1.2).
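The magic-number check described above can be sketched as follows. The helper name `has_pdf_signature` is hypothetical, and the function only compares the first 5 bytes, so it inherits the caveat that any file can be made to start with them.

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
// Returns true when the file starts with the 5-byte signature "%PDF-".
function has_pdf_signature(string $path): bool
{
    // Open in binary mode; suppress the warning and report failure as false.
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // Only the first 5 bytes are read, so the cost does not depend on file size.
    $header = fread($handle, 5);
    fclose($handle);

    return $header === '%PDF-';
}
```

For the upload in the question, the check would run on `$_FILES['uploadfile']['tmp_name']` before `move_uploaded_file()` is called.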

nickb
Mike
  • 1
    That's a VERY expensive solution! :D – HBv6 Jun 14 '12 at 18:34
  • 1
    Isn't this basically what the MIME type checks also do, but in a cheaper way? – Frog Jun 14 '12 at 18:34
  • 2
    Good point, it is essentially what a mime type check is probably already doing. But, if you do want to be more certain it is a valid PDF file (and you don't mind the extra processing time), you could scan the file for other expected constructs, such as the PDF "%%EOF" marker at the end of the file. (assuming this is more than what the mime check does) – Mike Jun 14 '12 at 18:48
  • That's a very insecure solution. If I were trying to beat that protection, I would simply replace the first few bytes with the bytes it expects. Very easy to do. – sf_admin Oct 27 '21 at 17:56
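Mike's comment above suggests also looking for the "%%EOF" marker near the end of the file. A rough sketch of that extra check follows. The helper name `has_pdf_eof_marker` and the 1024-byte search window are assumptions made for illustration, not something stated in the thread.

```php
<?php
// Hypothetical helper for illustration; the name and the default window size
// are assumptions, not taken from the original thread.
// Returns true when "%%EOF" appears within the last $window bytes of the file.
function has_pdf_eof_marker(string $path, int $window = 1024): bool
{
    if ($window < 1) {
        return false;
    }

    // Drop any cached stat data so the size reflects the file as it is now.
    clearstatcache(true, $path);
    $size = @filesize($path);
    if ($size === false || $size === 0) {
        return false;
    }

    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // Jump to the start of the tail window, or to offset 0 for short files.
    fseek($handle, max(0, $size - $window));
    $tail = fread($handle, $window);
    fclose($handle);

    return $tail !== false && strpos($tail, '%%EOF') !== false;
}
```

As the last comment points out, header and trailer bytes are easy to forge, so this narrows the set of accepted files without proving that the file is a well-formed PDF.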