
I want to validate that all the files in a directory are of a certain type. Here is what I have so far:

private static final String[] IMAGE_EXTS = { "jpg", "jpeg" };

private void validateFolderPath(String folderPath, final String[] ext) {

        File dir = new File(folderPath);

        int totalFiles = dir.listFiles().length;

        // Filter the files with JPEG or JPG extensions.
        File[] matchingFiles = dir.listFiles(new FileFilter() {
            public boolean accept(File pathname) {
                return pathname.getName().endsWith(ext[0])
                        || pathname.getName().endsWith(ext[1]);
            }
        });

        // Check if all the files have JPEG or JPG extensions
        // Terminate if validation fails.
        if (matchingFiles.length != totalFiles) {
            System.out.println("All the files should be of type " + ext[0]
                    + " or " + ext[1]);
            System.exit(0);
        } else {
            return;
        }

    }

This works fine if the file names have an extension, like {file.jpeg, file.jpg}. It fails if the files have no extension, like {file1, file2}, even though they are JPEGs. When I run the following in my terminal I get:

$ file folder/file1 
folder/file1: JPEG image data, JFIF standard 1.01

Update 1:

I tried to get the magic numbers of the file to check if it is JPEG:

for (int i = 0; i < totalFiles; i++) {
    DataInputStream input = new DataInputStream(
            new BufferedInputStream(new FileInputStream(
                    dir.listFiles()[i])));

    if (input.readInt() == 0xffd8ffe0) {
        isJPEGFlag = true;
    } else {
        isJPEGFlag = false;
        try {
            input.close();
        } catch (IOException ignore) {
        }
        System.out.println("File not JPEG");
        System.exit(0);
    }
}

I ran into another problem: there are some .DS_Store files in my folder. Any idea how to ignore them?

yesh
  • You mean how do you verify if the file having no extension is a JPEG file or not? – Kalpak Gadre Oct 11 '12 at 18:01
  • Just because a filename ends with a particular extension does not mean that the _content_ of that file corresponds to its name. You need to read the content of the file (at least the first N bytes); that's what the 'file' command does... – Art Swri Oct 11 '12 at 18:03
  • Did anyone notice when Windows had a penchant for creating JPEG images with a `.jpe` extension? AFAIR it was saving images direct out of IE, but my memory is a bit hazy. – Andrew Thompson Oct 11 '12 at 18:41
  • Changes look ok except I would wrap your streams in using blocks so the connections are closed after reading each file. – emalamisura Oct 12 '12 at 16:05

3 Answers


Firstly, file extensions are not mandatory: a file without an extension could very well be a valid JPEG file.

Check the specification for the JPEG format (it is an ITU-T/ISO standard, T.81, rather than an RFC). File formats generally start with a fixed sequence of bytes that identifies the format; for JPEG this is the Start Of Image marker, FF D8. Checking it is not entirely straightforward, but I am not sure there is a better way.

In a nutshell, you have to open each file, read the first n bytes (n depends on the file format), and check whether they match the format you expect. If they do, it is very likely a JPEG file, even if it has an exe extension or no extension at all.
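
The steps above could be sketched in Java roughly as follows. This is a minimal sketch, not a definitive implementation: the class and method names (JpegFolderCheck, hasJpegSignature, isJpeg, allJpegs) are made up for illustration, and it assumes that checking the first three bytes (FF D8 FF) is strict enough for your purposes. It also skips hidden and dot-prefixed entries, which is one way to ignore the .DS_Store files mentioned in the question.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class JpegFolderCheck {

    // A JPEG stream starts with the SOI marker FF D8; the next marker also begins with FF.
    static boolean hasJpegSignature(byte[] head, int length) {
        return length >= 3
                && (head[0] & 0xFF) == 0xFF
                && (head[1] & 0xFF) == 0xD8
                && (head[2] & 0xFF) == 0xFF;
    }

    // Reads up to three bytes from the file and compares them with the signature.
    static boolean isJpeg(File file) throws IOException {
        byte[] head = new byte[3];
        try (InputStream in = new FileInputStream(file)) {
            int total = 0;
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                total += n;
            }
            return hasJpegSignature(head, total);
        }
    }

    // Returns true only if every regular, non-hidden file in dir looks like a JPEG.
    static boolean allJpegs(File dir) throws IOException {
        File[] entries = dir.listFiles();
        if (entries == null) {
            throw new IOException("Not a readable directory: " + dir);
        }
        for (File entry : entries) {
            // Skip subdirectories and hidden files such as .DS_Store.
            if (!entry.isFile() || entry.isHidden() || entry.getName().startsWith(".")) {
                continue;
            }
            if (!isJpeg(entry)) {
                return false;
            }
        }
        return true;
    }
}
```

Calling JpegFolderCheck.allJpegs(new File(folderPath)) would then replace the extension-based comparison; the try-with-resources block closes each stream whether or not the check succeeds.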

Kalpak Gadre

For JPEGs you can check the magic number in the header of the file:

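using System;
using System.IO;

static bool HasJpegHeader(string filename)
{
    // OpenRead asks for read-only access, which is all this check needs.
    using (BinaryReader br = new BinaryReader(File.OpenRead(filename)))
    {
        // ReadUInt16 is little-endian, so the bytes FF D8 come back as 0xd8ff.
        UInt16 soi = br.ReadUInt16();   // Start Of Image marker (FF D8)
        UInt16 jfif = br.ReadUInt16();  // JFIF APP0 marker (FF E0)
        return soi == 0xd8ff && jfif == 0xe0ff;
    }
}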

A more complete method, which covers EXIF as well, is here: C# How can I test a file is a jpeg?
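
Since the question is in Java, here is a rough Java sketch of the same idea that also accepts the Exif variant. The names (JpegHeaderKind, classify) are hypothetical. It relies on DataInputStream.readInt() being big-endian, so the four bytes FF D8 FF E0 read back as 0xFFD8FFE0, the value used in the question's update; a file that starts with an APP1 segment (FF E1, typical for Exif) would fail that exact comparison.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

class JpegHeaderKind {

    // Classifies the first four bytes of a file, read as one big-endian int.
    static String classify(int firstFourBytes) {
        // Bytes 0-2 must be FF D8 FF: the SOI marker plus the start of the next marker.
        if ((firstFourBytes & 0xFFFFFF00) != 0xFFD8FF00) {
            return "not JPEG";
        }
        int marker = firstFourBytes & 0xFF;
        if (marker == 0xE0) {
            return "APP0";   // typically a JFIF file
        }
        if (marker == 0xE1) {
            return "APP1";   // typically an Exif file, e.g. straight from a camera
        }
        return "other";      // still starts like a JPEG, with a different first segment
    }

    // Reads the first four bytes of the file and classifies them.
    static String classify(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            return classify(in.readInt());   // readInt() is big-endian
        } catch (EOFException tooShort) {
            return "not JPEG";               // fewer than four bytes in the file
        }
    }
}
```

Treating anything other than "not JPEG" as acceptable is a looser check than comparing against 0xFFD8FFE0 alone, which may or may not be what you want.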

emalamisura
  • Do JPEGs have a formal header? Your approach is interesting, but I am not sure it will work for JPEGs. – yesh Oct 11 '12 at 18:11
  • I made an update. Can you tell me if I am going the right way? – yesh Oct 11 '12 at 20:55

One good (though expensive) check for validity as an image understood by J2SE is to try to load it with ImageIO.read(File). That method returns null if no registered reader recognizes the content, and throws an IOException if an error occurs while reading.
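
A minimal sketch of that check, with a hypothetical class and method name (DecodableImageCheck, isDecodableImage). Note the assumption baked in: this accepts any format ImageIO can decode (PNG, GIF and BMP as well as JPEG), so it answers "is this a readable image?" rather than "is this specifically a JPEG?".

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

class DecodableImageCheck {

    // Returns true if ImageIO can fully decode the file as an image.
    static boolean isDecodableImage(File file) {
        try {
            // read() returns null when no registered reader claims the content.
            BufferedImage image = ImageIO.read(file);
            return image != null;
        } catch (IOException e) {
            // Missing, unreadable, or corrupt file.
            return false;
        }
    }
}
```

Because this decodes the whole image, it is much slower than inspecting a few header bytes, but it also catches files that start with the right bytes and are damaged further in.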

Andrew Thompson