I'm trying to figure out how to determine what extension a file has when I receive it using $content = file_get_contents("php://input");. I use jQuery to send it to upload.php as a POST request (xhr).

I include the filename, size and type in a request header but I don't know how to fetch the header from file_get_contents().

Thank you!

Magnus
  • Take a look at [magic number database](http://stackoverflow.com/questions/6024441/determining-filetype-with-php-what-is-magic-database) – mseifert Mar 21 '14 at 19:47
  • @mseifert and for `$filename` in that case I simply input `php://input`? – Magnus Mar 21 '14 at 19:49
  • Examine the first few bytes of $content and interpret it as a header for a finite set of known filetypes. Another [link](http://stackoverflow.com/questions/481743/how-can-i-determine-a-files-true-extension-type-programatically?lq=1) – mseifert Mar 21 '14 at 19:52
  • Do you have any example code or anything because I honestly have no clue on how to write that code.. @mseifert – Magnus Mar 21 '14 at 20:04

1 Answer

Here is a working example using test files. In your case, replace each file_get_contents() call with $content = file_get_contents("php://input"). You can find signatures for other file types here. I convert the first 12 bytes of each file to hex; if you need to match longer signatures, increase that number. For the file size, use strlen($content); filesize("php://input") will not work, because the php:// wrapper does not support stat().

<?php
    $content = file_get_contents("testjpg.jpg");
    $a = strToHex($content, 12);
    var_dump($a);
    echo getfiletype($a) . "<br>";

    $content = file_get_contents("testdoc.doc");
    $a = strToHex($content, 12);
    var_dump($a);
    echo getfiletype($a) . "<br>";

    $content = file_get_contents("testpdf.pdf");
    $a = strToHex($content, 12);
    var_dump($a);
    echo getfiletype($a) . "<br>";


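// Map the hex dump of a file's leading bytes to a file type.
function getfiletype($test){
    if (testsig($test, "FF D8 FF")){
        return "jpeg";
    }
    elseif (testsig($test, "25 50 44 46")){
        // "%PDF" in ASCII
        return "pdf";
    }
    elseif (testsig($test, "D0 CF 11 E0 A1 B1 1A E1")){
        // OLE2 compound file; legacy .xls and .ppt files share this signature
        return "doc";
    }
    else{
        return "unknown";
    }
}

// Return true if the hex string $test starts with the signature $sig.
function testsig($test, $sig){
    // remove spaces in sig
    $sig = str_replace(" ", "", $sig);
    return substr($test, 0, strlen($sig)) === $sig;
}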



// Convert the first $stop bytes of $string (all of it if $stop is null) to uppercase hex.
function strToHex($string, $stop=null){
    $hex = "";
    if ($stop === null){
        $stop = strlen($string);
    }
    $stop = min(strlen($string), $stop);

    for ($i=0; $i<$stop; $i++){
        $ord = ord($string[$i]);
        $hexCode = dechex($ord);
        $hex .= substr('0'.$hexCode, -2);
    }
    return strtoupper($hex);
}
?>

The result of the code is:

string 'FFD8FFE1034D457869660000' (length=24)

jpeg

string 'D0CF11E0A1B11AE100000000' (length=24)

doc

string '255044462D312E360D25E2E3' (length=24)

pdf
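
As for the filename, size and type sent in request headers: they are not part of php://input, which only contains the request body. PHP exposes request headers through $_SERVER, with the name upper-cased, dashes turned into underscores and an HTTP_ prefix. The sketch below assumes a hypothetical header called X-File-Name (the question does not say what the headers are called), and the helper names requestHeader and detectMimeType are made up for illustration. It also shows the magic database route mentioned in the comments, using the fileinfo extension on the raw bytes instead of a hand-written signature table.

```php
<?php
// Sketch only: the header name X-File-Name is hypothetical; use whatever
// name the jQuery request actually sets.

/**
 * Read a request header from a $_SERVER-style array.
 * A header such as "X-File-Name" appears as HTTP_X_FILE_NAME.
 */
function requestHeader(array $server, string $name): ?string
{
    $key = 'HTTP_' . strtoupper(str_replace('-', '_', $name));
    return isset($server[$key]) ? (string) $server[$key] : null;
}

/**
 * Detect the MIME type of raw bytes with the fileinfo extension,
 * which consults the magic number database.
 */
function detectMimeType(string $content): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->buffer($content);
    // Fall back to a generic type if detection fails.
    return $mime === false ? 'application/octet-stream' : $mime;
}

// In upload.php this would look roughly like:
// $content  = file_get_contents('php://input');
// $filename = requestHeader($_SERVER, 'X-File-Name');
// $mime     = detectMimeType($content);
// $size     = strlen($content);
```

Headers are set by the client, so the filename and type they carry should be treated as hints; the detected type from the content is the safer one to base decisions on.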
mseifert