I was using OpenCV to read the images from a folder. A lot of messages like this show up:

Corrupt JPEG data: premature end of data segment
Premature end of JPEG file
Premature end of JPEG file
Premature end of JPEG file

How can I catch this exception and remove these image files?

Pang
kli_nlpr

2 Answers

Since you said you are reading 'images' (multiple images), you are presumably looping through the files in the folder you read them from. In that case, if you check whether each image is valid by using the following:

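#include <iostream>
#include <opencv2/opencv.hpp>

int main(int argc, char** argv)
{
    if (argc < 2) return -1;                      // Need an image path
    // Read the file (cv::IMREAD_COLOR replaces the older CV_LOAD_IMAGE_COLOR)
    cv::Mat image = cv::imread(argv[1], cv::IMREAD_COLOR);

    if (image.empty())                            // Check for invalid input
    {
        std::cout << "Could not open or find the image" << std::endl;
        return -1;
    }
    return 0;
}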

you can then delete the files that are corrupt or bad.
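
The loop-and-delete step is not shown above, so here is a minimal Python sketch of it. The function name remove_invalid_images, its is_valid parameter, and the extension list are hypothetical, chosen only for illustration; the validity check is passed in as a callable so that any check can be plugged in.

```python
from pathlib import Path
from typing import Callable, List, Sequence


def remove_invalid_images(
    folder: str,
    is_valid: Callable[[str], bool],
    extensions: Sequence[str] = (".jpg", ".jpeg"),  # assumed set of extensions
) -> List[str]:
    """Delete every image in `folder` that fails `is_valid`; return the deleted paths."""
    removed = []
    for path in sorted(Path(folder).iterdir()):
        # Skip sub-directories and files that do not look like JPEGs
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        if not is_valid(str(path)):
            path.unlink()  # permanently deletes the file
            removed.append(str(path))
    return removed
```

With the OpenCV Python bindings, the counterpart of the check above would be something like remove_invalid_images("images", lambda p: cv2.imread(p) is not None), since cv2.imread returns None when it cannot load a file. Because the deletion is permanent, it may be worth running once with a check that only logs before deleting anything.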

jm'

I've been struggling to find a solution too. I read dozens of articles, most of which just state that OpenCV does not throw errors and only prints them to stderr. Some suggest using PIL, but it does not detect most kinds of image corruption, usually only a premature end of file.

However, the same errors that OpenCV warns about can be detected with ImageMagick.

  • Install ImageMagick (https://imagemagick.org/)
  • Make sure it is on your PATH.
  • Put the following function into your code and call it wherever you need to verify a file. It still prints errors to stderr, but thanks to "-regard-warnings" the call also raises an error, which the function catches.
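import subprocess

def checkFile(imageFile):
    """Return True if identify exits cleanly for imageFile, False otherwise."""
    try:
        # check_returncode() raises CalledProcessError on a non-zero exit status
        subprocess.run(["identify", "-regard-warnings", imageFile]).check_returncode()
        return True
    except subprocess.CalledProcessError:
        return False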

If you don't want the check to spam your output, pass stdout=subprocess.DEVNULL and stderr=subprocess.DEVNULL to the subprocess.run call.

On Windows, if you have not installed the legacy commands, use the new syntax:

subprocess.run(["magick", "identify", "-regard-warnings", imageFile]).check_returncode()
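
Putting the two notes above together, a quiet variant of the check might look like the sketch below. The name check_file_quiet and its command parameter are hypothetical additions for illustration; the default command is the one from this answer, and the sketch assumes the exit status is all you need from the tool.

```python
import subprocess
from typing import Sequence


def check_file_quiet(
    image_file: str,
    command: Sequence[str] = ("identify", "-regard-warnings"),
) -> bool:
    """Return True if the checker command exits with status 0 for image_file."""
    result = subprocess.run(
        [*command, image_file],
        stdout=subprocess.DEVNULL,  # discard the normal identify output
        stderr=subprocess.DEVNULL,  # discard the warnings and errors
    )
    # A missing executable raises FileNotFoundError here instead of
    # returning False, so a broken install is not mistaken for a bad image.
    return result.returncode == 0
```

On Windows without the legacy commands, the same function could be called as check_file_quiet(path, command=("magick", "identify", "-regard-warnings")).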
Mirronelli