
In an answer to the question "check manually for jpeg end of file marker ffd9 (?) in php to catch truncation errors", the EOF characters for a JPEG are given as "\xFF\xD9". When I use this, the script rejects what appear to be valid JPEGs, whereas if I remove the 'x' characters it accepts them. What is the purpose of the x's, and does it matter if they are omitted?

$imgdata = fopen($uploadfile, 'rb'); // 'rb' reads the file as binary
fseek($imgdata, -2, SEEK_END); // move to 2 bytes before EOF
$eofdata = fread($imgdata, 2); // read the last 2 bytes
fclose($imgdata);
switch ($mimetype) {
    case "image/jpeg":
    case "image/pjpeg":
        $eof = "\xFF\xD9"; // JPEG End of Image marker
        break;
    case "image/png":
        // "\60\82" is not two hex bytes: \60 is an octal escape and \82 is left as-is.
        $eof = "\x60\x82"; // last 2 bytes of the PNG IEND chunk
        break;
}
if ($eofdata !== $eof) {
    $valid = false;
}
Nick Iredale

1 Answer


In a double-quoted PHP string literal, \x introduces a character code written in hexadecimal. So "\xFF" is a one-character string whose single byte has the hex value FF, which is decimal 255.

If you leave out \x, "FF" is a string with two F characters in it.

See the complete list of PHP escape codes in the string documentation.

If it's not working correctly for you, make sure the string is enclosed in double quotes, not single quotes; escape sequences such as \x are only interpreted in double-quoted strings. See "What is the difference between single-quoted and double-quoted strings in PHP?"
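The difference between the three spellings is easy to see by inspecting them with `strlen()` and `bin2hex()`. This is only a small illustrative sketch; the variable names are made up for the example.

```php
<?php
// Double-quoted with \x: each escape is decoded to a single byte.
$marker = "\xFF\xD9";
// No \x at all: just the four ASCII characters F, F, D, 9.
$plain = "FFD9";
// Single-quoted: the \x sequences are NOT decoded, so all 8 characters stay literal.
$single = '\xFF\xD9';

echo strlen($marker), ' ', bin2hex($marker), "\n"; // 2 ffd9
echo strlen($plain), ' ', bin2hex($plain), "\n";   // 4 46464439
echo strlen($single), ' ', bin2hex($single), "\n"; // 8 5c7846465c784439
```

Only the first string can ever equal the two bytes read from the end of a JPEG file.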

I checked some of my JPEG files, and the EOF marker is not always at the end of the file. If you want to ensure that the file isn't truncated, search for it anywhere in the file, not just at the end.

$contents = file_get_contents($uploadfile);
if (strpos($contents, $eof) === false) {
    $valid = false;
}
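A variation on the same idea, as a rough sketch: `strrpos()` finds the last occurrence of the marker, which also shows how many bytes follow it. The helper name `trailingBytesAfterEoi` and the sample data are hypothetical, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the number of bytes after the last FFD9,
// or null if the marker does not occur anywhere in the data.
function trailingBytesAfterEoi(string $contents): ?int
{
    $pos = strrpos($contents, "\xFF\xD9");
    if ($pos === false) {
        return null;
    }
    // The marker itself is 2 bytes long.
    return strlen($contents) - ($pos + 2);
}

// Synthetic data standing in for a file with a space and newline after FFD9.
$sample = "\xFF\xD8" . "image data" . "\xFF\xD9\x20\x0A";
var_dump(trailingBytesAfterEoi($sample));              // int(2)
var_dump(trailingBytesAfterEoi("\xFF\xD8truncated"));  // NULL
```

Finding the marker does not by itself prove the file is complete, since an embedded image contributes its own FFD9; treat this as a sanity check rather than full validation.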

According to Wikipedia, this is not actually an EOF sequence; it's the End of Image marker. If there are multiple image segments in the file, it will appear multiple times: each image is surrounded by FFD8 Start of Image and FFD9 End of Image.
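To see how many image segments a file appears to contain, the two byte pairs can simply be counted. This is a minimal sketch with a hypothetical helper name (`countJpegMarkers`) and synthetic data; a raw count like this can over-count if the same byte pair happens to occur inside metadata, so it is a rough indicator only.

```php
<?php
// Hypothetical helper: counts raw occurrences of the SOI and EOI byte pairs.
function countJpegMarkers(string $contents): array
{
    return [
        'soi' => substr_count($contents, "\xFF\xD8"), // Start of Image
        'eoi' => substr_count($contents, "\xFF\xD9"), // End of Image
    ];
}

// Synthetic data: an outer image wrapping one embedded thumbnail.
$sample = "\xFF\xD8" . "meta" . "\xFF\xD8" . "thumb" . "\xFF\xD9" . "main" . "\xFF\xD9";
print_r(countJpegMarkers($sample)); // soi => 2, eoi => 2
```

For a real file, pass the result of `file_get_contents($uploadfile)` instead of the synthetic string.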

See also "JPEG EOF vs EOI marker".

Barmar
  • I just looked at one of my jpeg files with a hex editor, and after FFD9 there's a space and newline. So I guess the answer in the other question isn't really accurate. – Barmar Apr 27 '16 at 16:19
  • I checked a different file, and it did end with FFD9. I think the idea is that FFD9 marks the end of the data, and everything after that should be ignored. So it doesn't have to be literally at the end of the file. – Barmar Apr 27 '16 at 16:22
  • It looks like jpegs not ending with FFD9 fail with imagecreatefromjpeg - I haven't found any that don't end with that if they are capable of being handled by it – Nick Iredale Apr 27 '16 at 16:32
  • I just tried it with my file that ends with `\xff\xd9\x20\x0a` and it worked. – Barmar Apr 27 '16 at 16:35
  • I wonder if that is another 'valid' EOF or if there can be any characters after `\xff\xd9` and if so, how many. I'll take a look at some. – Nick Iredale Apr 30 '16 at 05:22
  • I've looked at around 20 jpegs from different sources and all end in `\xff\xd9`. In some both `\xff\xd8` and `\xff\xd9` appear three times - presumably defining sections such as an embedded thumbnail and the main image. – Nick Iredale Apr 30 '16 at 05:42
  • @NickIredale If there are multiple image segments in a file, each will start with `FFD8` and end with `FFD9`. – Barmar Apr 30 '16 at 18:14