13

Please help me find out the type of a file that is being uploaded. I want to distinguish between Excel and CSV files.

The MIME type returned is the same for both kinds of file. Please help.

Anuradha
  • 145
  • 1
  • 1
  • 6
  • 2
    Maybe duplicate of http://stackoverflow.com/questions/2729038/is-there-a-java-library-equivalent-to-file-command-in-unix . At least you can find your answer there – Shervin Asgari Jan 03 '11 at 09:52

6 Answers

19

I use Apache Tika, which detects the MIME type from magic byte patterns and globbing hints (the file extension). It also supports additional parsing of file contents (which I don't really use).

Here is a quick and dirty example of how Tika can be used to detect the file type without performing any additional parsing of the file:

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashMap;

import org.apache.tika.metadata.HttpHeaders;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaMetadataKeys;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.xml.sax.helpers.DefaultHandler;

public class Detector {

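    public static void main(String[] args) throws Exception {
        File file = new File("/path/to/file.xls");

        // An empty parser map means the content is only type-detected, not parsed.
        AutoDetectParser parser = new AutoDetectParser();
        parser.setParsers(new HashMap<MediaType, Parser>());

        // Pass the file name along so the extension can serve as a detection hint.
        Metadata metadata = new Metadata();
        metadata.add(TikaMetadataKeys.RESOURCE_NAME_KEY, file.getName());

        // try-with-resources closes the stream even if parse() throws.
        try (InputStream stream = new FileInputStream(file)) {
            parser.parse(stream, new DefaultHandler(), metadata, new ParseContext());
        }

        // The detected type is stored under the Content-Type metadata key.
        String mimeType = metadata.get(HttpHeaders.CONTENT_TYPE);
        System.out.println(mimeType);
    }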

}
buge
  • 375
  • 1
  • 7
9

I hope this helps. It is taken from an example that is not my own:

import javax.activation.MimetypesFileTypeMap;
import java.io.File;

class GetMimeType {
  public static void main(String args[]) {
    File f = new File("test.gif");
    System.out.println("Mime Type of " + f.getName() + " is " +
                         new MimetypesFileTypeMap().getContentType(f));
    // expected output :
    // "Mime Type of test.gif is image/gif"
  }

}

The same may hold for Excel and CSV files, but I have not tested it.

Neigyl R. Noval
  • 6,018
  • 4
  • 27
  • 45
5

I figured out a cheaper way of doing this with java.nio.file.Files:

public String getContentType(File file) throws IOException {
    return Files.probeContentType(file.toPath());
}

- or -

public String getContentType(Path filePath) throws IOException {
    return Files.probeContentType(filePath);
}

Hope that helps.

Cheers.

tmwanik
  • 1,643
  • 14
  • 20
  • 3
    Be careful, because it's OS dependent! My mac wasn't even able to detect a css file's MIME type. – Lakatos Gyula Aug 14 '13 at 13:52
  • Unfortunately, this only examines the file extension (at least on Ubuntu 20). If a file has a different extension for some reason or no extension at all, this solution will not work. – guninvalid Jan 01 '22 at 07:47
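Following up on the two comments above: Files.probeContentType can return null when the platform has no mapping for the file. Here is a minimal sketch of a wrapper that falls back to URLConnection.guessContentTypeFromName and then to a generic default. The class name ContentTypeProbe and the constant FALLBACK are made up for this example. Note that both lookups may rely on the file name alone, so this does not help when the extension is wrong or missing.

```java
import java.io.IOException;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the JDK or any library.
class ContentTypeProbe {

    // Generic "unknown binary" type used when nothing else matches (an assumption of this sketch).
    static final String FALLBACK = "application/octet-stream";

    // Never returns null: tries the OS-dependent probe first, then the JDK's name-based table.
    static String probe(Path file) {
        String type = null;
        try {
            type = Files.probeContentType(file);
        } catch (IOException e) {
            // Treat an I/O problem the same as "type could not be determined".
        }
        if (type == null) {
            Path name = file.getFileName();
            if (name != null) {
                type = URLConnection.guessContentTypeFromName(name.toString());
            }
        }
        return type != null ? type : FALLBACK;
    }

    public static void main(String[] args) {
        // Prints whatever the current platform reports for each path given on the command line.
        for (String arg : args) {
            System.out.println(arg + " -> " + probe(Paths.get(arg)));
        }
    }
}
```

The result still varies by operating system and JDK version, so it is best treated as a hint and not as proof of what the file contains.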
2

A better way without using javax.activation.*:

 String mimeType = URLConnection.guessContentTypeFromName(f.getAbsolutePath());
b_erb
  • 20,932
  • 8
  • 55
  • 64
2

If you are already using Spring, this works for CSV and Excel:

import org.springframework.mail.javamail.ConfigurableMimeFileTypeMap;

import javax.activation.FileTypeMap;
import java.io.IOException;

public class ContentTypeResolver {

    private FileTypeMap fileTypeMap;

    public ContentTypeResolver() {
        fileTypeMap = new ConfigurableMimeFileTypeMap();
    }

    public String getContentType(String fileName) throws IOException {
        if (fileName == null) {
            return null;
        }
        return fileTypeMap.getContentType(fileName.toLowerCase());
    }

}

Alternatively, with javax.activation, you can update the mime.types file.

Ceren
  • 71
  • 7
1

A CSV file starts with text, whereas an Excel file is most likely binary.

However, the simplest approach is to try to load the file as an Excel document using POI. If that fails, try to load it as a CSV; if that fails too, it is possibly neither type.
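The text-versus-binary observation can also be checked cheaply before handing the file to POI. The sketch below looks at the first bytes only: legacy .xls files are OLE2 compound documents, .xlsx files are ZIP archives, and a CSV should contain no NUL bytes. The class SpreadsheetSniffer, its Kind enum and the 512-byte window are assumptions made for this example, not part of POI. The match is only a hint, because other formats (.doc, .docx, any ZIP) share the same signatures and a UTF-16 CSV would contain NUL bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of POI or the JDK.
class SpreadsheetSniffer {

    enum Kind { OLE2, ZIP, TEXT, UNKNOWN }

    // OLE2 compound document signature (legacy .xls, but also .doc and .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature (.xlsx, but also every other ZIP-based format).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Classifies the leading bytes of a file.
    static Kind sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(head, ZIP_MAGIC)) return Kind.ZIP;
        if (head.length == 0) return Kind.UNKNOWN;
        for (byte b : head) {
            if (b == 0) return Kind.UNKNOWN; // a NUL byte suggests binary content
        }
        return Kind.TEXT; // no NUL bytes seen: plausibly CSV or other plain text
    }

    // Reads up to 512 bytes from the file and classifies them (readNBytes needs Java 11+).
    static Kind sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(512));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A result of OLE2 or ZIP says it is worth trying POI, and TEXT says it is worth trying a CSV parser. Whether the file is really a spreadsheet is still decided by the parser.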

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130