I have a corrupted raw file that I want to recover (click here to download it if you want). I think the JPEG embedded in the raw file is intact, so I've been trying to extract it in different ways, using free software like ERawP and even writing a simple Python program.

The problem with these methods is that neither the programs nor the Python libraries recognize the file as a raw file in the first place.

Any advice?

EDIT: The photo was taken with a Fujifilm camera, in case that's relevant.

Thanks

Here is the code I was using:

import rawpy
import imageio

path = '/pathToFile/_FFF9198.RAF'
# Open the raw file
raw = rawpy.imread(path)
# Demosaic the raw data into an RGB array
rgb = raw.postprocess()
# Save the result as a TIFF
imageio.imsave('findMe.tiff', rgb)
  • "Cool", maybe, but I'm not sure it's in-scope. There's nothing narrow and specific about needing to figure out how to parse a file with a type of corruption unknown even to the OP. A good Stack Overflow question has enough research already done that everything needed to answer or to evaluate an answer can be included in the question itself; that's really not the case here. – Charles Duffy Jul 07 '21 at 03:21
  • BTW, `recoverjpeg` pulls quite a lot of JPEGs out of the file in question. (Certainly not an on-topic answer here, though, as it's not an answer that describes how to develop software, and tool recommendations are off-topic to begin with). – Charles Duffy Jul 07 '21 at 03:29
  • ...see https://rfc1149.net/devel/recoverjpeg.html for the aforementioned tool; also at https://github.com/samueltardieu/recoverjpeg – Charles Duffy Jul 07 '21 at 03:31
  • See [Restoration of a corrupted JPEG file](https://photo.stackexchange.com/q/116503) – wovano Sep 28 '22 at 13:15

1 Answer

The file is literally filled with JPEGs.

The app works in a really simple way:

  1. Every JPEG starts with the byte sequence FF D8 FF.
  2. A JPEG decoder does not care about the exact file size, as long as the file is not too small.
  3. A JPEG ends with FF D9, and most decoders (viewers) simply stop decoding at that point.

So the app scans for FF D8 FF, opens a new file, and copies all data from that byte sequence onward into the new file, without worrying about the FF D9 marker. This is the dumb way to do it.

In other words, the source file looks like this:

... FF D8 FF ... FF D9 .. FF D8 FF .. FF D9 .. etc. .. END

Whenever you encounter the byte sequence FF D8 FF, you copy it AND everything that follows to a new file. This is the dumb way.
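
As a rough illustration, the dumb way could be sketched in Python like this. The function names `find_jpeg_starts` and `carve_naive` and the output file naming are made up for this example; they are not taken from the app in question, and the whole file is assumed to fit in memory.

```python
from pathlib import Path

# FF D8 (start of image) followed by the FF of the first marker
SIGNATURE = b"\xff\xd8\xff"


def find_jpeg_starts(data: bytes) -> list:
    """Return the offset of every FF D8 FF sequence in data."""
    offsets = []
    pos = data.find(SIGNATURE)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(SIGNATURE, pos + 1)
    return offsets


def carve_naive(data: bytes, out_dir: str) -> list:
    """For every FF D8 FF found, write everything from there to the end of data."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, offset in enumerate(find_jpeg_starts(data)):
        target = out / f"carved_{i:03d}_at_{offset:#x}.jpg"
        # No attempt to stop at FF D9: the viewer is trusted to stop there itself
        target.write_bytes(data[offset:])
        written.append(target)
    return written
```

Something like `carve_naive(Path('/pathToFile/_FFF9198.RAF').read_bytes(), 'carved')` would then produce one (oversized) file per candidate JPEG.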

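The smart way:

First we need to learn a bit about JPEG:

A JPEG file is organized around so-called markers. A marker is FF nn. EVERY time a JPEG decoder sees FF nn (nn = some value), it will treat it as a JPEG marker (the one exception inside the compressed image data is FF 00, which stands for a literal FF byte). Here's a list of possible markers that I maintain on my website.

A JPEG starts with FF D8, which is immediately followed by the first marker, FF nn. The next two bytes are the size of that section (big-endian, and the count includes the two size bytes themselves). So the offset of the size bytes plus the size gets us to the next marker. This is true for all markers except FF D8 (start of file), FF DA (start of image data: its size only covers a short header, and the compressed data that follows has no size of its own) and FF D9 (end of image data).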

So:

FF D8 - FF nn xx xx - FF nn xx xx - FF DA .......... JPEG image data ....... FF D9.

So the smart way to carve a JPEG is to follow the chain of markers starting at FF D8 until we find FF DA; from there on, everything is JPEG data until we encounter FF D9.
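
A minimal Python sketch of that marker walk might look like the following. The name `carve_jpeg` is hypothetical, and the code makes simplifying assumptions: every marker before FF DA is assumed to carry a size, and the first FF D9 after the first FF DA is taken as the end, which may cut a progressive JPEG (several scans) short in rare cases.

```python
import struct
from typing import Optional

SOS = 0xDA  # start of scan: compressed image data follows its header


def carve_jpeg(data: bytes, start: int) -> Optional[bytes]:
    """Return the JPEG that begins at start, or None if the marker chain breaks."""
    if data[start:start + 2] != b"\xff\xd8":
        return None
    pos = start + 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # a marker was expected here, so the chain is broken
        marker = data[pos + 1]
        # Two-byte big-endian size; it counts itself but not the FF nn marker
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if size < 2:
            return None
        if marker == SOS:
            # Skip the scan header, then look for FF D9 (assumption: first hit is the end)
            end = data.find(b"\xff\xd9", pos + 2 + size)
            return data[start:end + 2] if end != -1 else None
        pos += 2 + size  # jump to the next marker
    return None
```

Because the walk jumps over each section by its size, a thumbnail JPEG embedded inside a metadata section is skipped rather than mistaken for the end of the image. A candidate offset where the chain breaks simply returns `None`, so false FF D8 FF hits are filtered out.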

If we keep in mind that each time a JPEG decoder sees FF nn it will assume it's a JPEG marker, we can explain how half-grey JPEG files can sometimes be repaired. If the JPEG data contains the bytes FE 89 and one single bit flips, we could have FF 89 all of a sudden. Most JPEG decoders will choke, as they cannot determine what the marker FF 89 is supposed to do. If we change the byte combination to anything that is NOT 'FF nn', the decoder will happily continue, although the image may look distorted.
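
To hunt for such a stray marker, one could scan the compressed data (the bytes between the FF DA header and the final FF D9) for FF nn pairs that have no business being there. The sketch below is only an illustration with hypothetical helper names. It assumes a single-scan JPEG and treats FF 00 (a literal FF byte) and FF D0 to FF D7 (restart markers) as the only legitimate pairs; the replacement value FE is an arbitrary choice.

```python
def find_suspect_markers(scan: bytes) -> list:
    """Return (offset, nn) for every FF nn in scan that is not FF 00 or FF D0..FF D7."""
    suspects = []
    pos = scan.find(b"\xff")
    while pos != -1 and pos + 1 < len(scan):
        nn = scan[pos + 1]
        if nn != 0x00 and not 0xD0 <= nn <= 0xD7:
            suspects.append((pos, nn))
        pos = scan.find(b"\xff", pos + 1)
    return suspects


def neutralize(scan: bytes, offset: int, replacement: int = 0xFE) -> bytes:
    """Return a copy of scan with the FF at offset replaced, so it no longer reads as a marker."""
    patched = bytearray(scan)
    patched[offset] = replacement
    return bytes(patched)
```

The patched bytes are a guess, not a recovery of the original value, so the image may still look distorted from that point on.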