A few days ago I asked a question in a different field, and a friend (@emcconville) helped me with a script to "recover every JPEG file in a single file". I have now realized that this program only works on images in the standard "JFIF" format and cannot retrieve images in the "EXIF" format (images taken by digital cameras).

How can I change the program so that it also recognizes the Exif format in images? I'm not familiar with Python, and I do not know what it is capable of.

Thanks

import struct

with open('src.bin', 'rb') as f:
    # Calculate file size.
    f.seek(0, 2)
    total_bytes = f.tell()
    # Rewind to the beginning.
    f.seek(0)
    file_cursor = f.tell()
    image_cursor = 0

    while file_cursor < total_bytes:
        # Scan for the start of a JPEG.
        if f.read(1) == b"\xFF":
            if f.read(3) == b"\xD8\xFF\xE0":
                print("JPEG FOUND!")
                # Backup and find the size of the image
                f.seek(-8, 1)
                payload_size = struct.unpack('<I', f.read(4))[0]
                # Write image to disk
                d_filename = 'image{0}.jpeg'.format(image_cursor)
                with open(d_filename, 'wb') as d:
                    d.write(f.read(payload_size))
                image_cursor += 1
        file_cursor = f.tell()
Joe
Little Elite

1 Answer

EXIF files have a marker of 0xffe1, JFIF files have a marker of 0xffe0. So all code that relies on 0xffe0 to detect a JPEG file will miss all EXIF files. (from here)

So just change

if f.read(3) == b"\xD8\xFF\xE0":

to

if f.read(3) == b"\xD8\xFF\xE1":

If you want to check for both cases, do not call .read() once per comparison, because every call advances the file position. Instead, read once and compare the result, something like

x = f.read(3)
if x in (b"\xD8\xFF\xE0", b"\xD8\xFF\xE1"):
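
Put together, the scan could look roughly like the sketch below. It works on bytes already loaded into memory instead of a file handle, and it assumes (as the script in the question does) that each image is preceded by its size as a 4-byte little-endian integer. The names `JPEG_PREFIXES` and `extract_jpegs` are made up for this example.

```python
import struct

# SOI (FF D8) followed by APP0 (FF E0, JFIF) or APP1 (FF E1, Exif).
JPEG_PREFIXES = (b"\xFF\xD8\xFF\xE0", b"\xFF\xD8\xFF\xE1")


def extract_jpegs(data):
    """Return the JPEG payloads found in data (hypothetical helper).

    Assumption taken from the question's script: the 4 bytes right
    before each image hold its size, little-endian, unsigned.
    """
    images = []
    pos = 4  # leave room for the size field in front of the first image
    while pos + 4 <= len(data):
        # Slice once and test against both prefixes; slicing does not
        # move any cursor, so nothing is skipped on a mismatch.
        if data[pos:pos + 4] in JPEG_PREFIXES:
            size = struct.unpack('<I', data[pos - 4:pos])[0]
            images.append(data[pos:pos + size])
            pos += max(size, 1)  # always move forward, even on a bogus size
        else:
            pos += 1
    return images
```

To use it on the file from the question, read `src.bin` in binary mode, pass the bytes to `extract_jpegs`, and write each returned payload to its own `image{n}.jpeg` file.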
Joe
  • Thanks, it works like a charm. But if I use an "or" in "if f.read(3)" to let the program recover both JFIF and EXIF images from a file, it fails! Why can't it do that? – Little Elite Aug 21 '18 at 04:32
  • Because `f.read()` reads data from the file at the current read position. After each read that position moves forward, in your case by three bytes. If you call `.read(3)` more than once, the second call no longer starts at the correct position. Instead use something like what I added above. – Joe Aug 21 '18 at 04:38
  • What a ridiculous mistake I made! You are right. Thanks a lot – Little Elite Aug 21 '18 at 06:14
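
For anyone who runs into the same thing, here is a small sketch of the effect discussed in the comments. The exact failing code was not posted, so the `or` version below is only a guess at what it looked like. The function names are invented for the demonstration, and `io.BytesIO` stands in for the real file.

```python
import io


def matches_with_two_reads(stream):
    # Presumed buggy variant: when the first comparison fails, the
    # second read() returns the NEXT three bytes, not the same ones.
    return stream.read(3) == b"\xD8\xFF\xE0" or stream.read(3) == b"\xD8\xFF\xE1"


def matches_with_one_read(stream):
    # Read once, then compare the stored bytes as often as needed.
    x = stream.read(3)
    return x in (b"\xD8\xFF\xE0", b"\xD8\xFF\xE1")


# Bytes as they appear right after the leading 0xFF of an Exif image.
exif = b"\xD8\xFF\xE1" + b"Exif\x00\x00"

print(matches_with_two_reads(io.BytesIO(exif)))  # False
print(matches_with_one_read(io.BytesIO(exif)))   # True
```

In the two-read version the Exif bytes are consumed by the first comparison, so the second comparison is made against `b"Exi"` and fails.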