Is there a way, in Python (or another language), to open a JPEG file and determine whether or not it is corrupt? For instance, if I terminate a download of a JPG file before it completes, I am unable to open the file and view it. Are there libraries that allow this to be done easily?
-
Maybe because it's a duplicate of http://stackoverflow.com/questions/889333/how-to-check-if-a-file-is-a-valid-image-file – Lev Levitsky Aug 21 '12 at 09:08
-
Thanks Lev! Isn't it better to post the link to the duplicate question than to downvote though? -_- I tried to look for an answer but I didn't see the above question. – Raymond Aug 21 '12 at 09:12
-
I'm not the downvoter, so there can be other reasons. Maybe someone didn't find enough effort on your side in the question (there's no code and no links in it). Anyway, one downvote is not always a BIG concern. Just remember to improve your question if you can according to [ask]. – Lev Levitsky Aug 21 '12 at 09:19
-
@Raymond: your question shows a lack of research effort. – Roland Smith Aug 21 '12 at 09:19
-
It seems like the responses to the [potential duplicate](https://stackoverflow.com/questions/889333/how-to-check-if-a-file-is-a-valid-image-file) are different and - in my case - less useful than the responses here. Going to upvote for this reason. – ximiki Dec 06 '17 at 21:01
3 Answers
You can try using PIL. Just opening a truncated JPG file won't fail, and neither will the verify method. Trying to load it will raise an exception, though.
First we mangle a good jpg file:
> du mvc-002f.jpg
56 mvc-002f.jpg
> dd if=mvc-002f.jpg of=broken.jpg bs=1k count=20
20+0 records in
20+0 records out
20480 bytes transferred in 0.000133 secs (154217856 bytes/sec)
Then we try the Python Imaging Library:
>>> from PIL import Image
>>> im = Image.open('broken.jpg')
>>> im.verify()
>>> im = Image.open('broken.jpg') # im.verify() invalidates the file pointer
>>> im.load()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/PIL/ImageFile.py", line 201, in load
    raise IOError("image file is truncated (%d bytes not processed)" % len(b))
IOError: image file is truncated (16 bytes not processed)
As user827992 said, even a truncated image can usually still be partially decoded and shown.

-
im.show() also helps in finding an exception. It actually opens the file with the default OS image viewer, and if it fails it will raise `OSError: broken data stream when reading image file` – maviz Jan 07 '17 at 05:15
-
the `load` routine seems like the only easy way I can find to check if an image is corrupt. – eggie5 Nov 30 '17 at 14:57
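You could do it using the PIL package:
from PIL import Image

def is_image_ok(fn):
    # Note: Image.open() is lazy and only reads the header,
    # so a truncated file can still pass this check.
    try:
        Image.open(fn)
        return True
    except Exception:
        return False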
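Since opening a truncated file does not fail, a variant that also forces a full decode with `load()` may be closer to what the question asks for. The following is a minimal sketch, assuming a current Pillow install (`from PIL import Image`); the function name `is_image_fully_decodable` is hypothetical and not part of PIL.

```python
from PIL import Image


def is_image_fully_decodable(source):
    """Return True if Pillow can decode all pixel data of `source`.

    `source` may be a filename or a binary file-like object.
    """
    try:
        with Image.open(source) as im:
            # open() only parses the header; load() decodes the pixel
            # data, which is where a truncated file raises an error.
            im.load()
        return True
    except Exception:
        # Pillow usually raises OSError here (for example
        # "image file is truncated"), but a damaged header can
        # surface as other exception types, so catch broadly.
        return False
```

Note that this only reports whether the decoder got through the whole file. An image whose bytes were altered but still decode would pass.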

I don't think so.
The JPEG standard is more of a container format than a specification of the implementation.
"Corrupted" usually means that the file no longer represents the original data. Most of the time it can still be decoded: the output is undefined and not what it was supposed to be, but a JPEG decoder will most likely still produce something. Also, since no unique bit arrangement is associated with the JPEG file format, you can't detect this programmatically: there is no specific pattern to check against, and even if there were, by parsing only the file itself you can't tell that a bit is in the wrong place or missing without knowing the original content.
The header of the file can also be corrupted, but in that case the file is simply corrupt in the way any generic file can be, regardless of what it contains.
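For the narrower case in the question (a download cut off partway), a coarse structural check is still possible without any imaging library: a JPEG stream begins with the SOI marker (bytes FF D8) and ends with the EOI marker (bytes FF D9). The sketch below only tests for those two markers; the helper name `looks_complete_jpeg` is made up for illustration. It cannot detect damaged pixel data, in line with the point above, and a valid file with extra bytes appended after EOI would be reported as incomplete.

```python
import os


def looks_complete_jpeg(path):
    """Heuristic: True if the file starts with SOI and ends with EOI."""
    with open(path, "rb") as f:
        # SOI (start of image) marker must be the first two bytes.
        if f.read(2) != b"\xff\xd8":
            return False
        # Need at least SOI + EOI for the end check to make sense.
        f.seek(0, os.SEEK_END)
        if f.tell() < 4:
            return False
        # EOI (end of image) marker should be the last two bytes.
        f.seek(-2, os.SEEK_END)
        return f.read(2) == b"\xff\xd9"
```

A file that fails this check is very likely truncated or not a JPEG at all; a file that passes may still be corrupt in the middle.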
