I am looking for a Java library that determines the MIME type by inspecting the file content (a byte array). I found this project, which uses jmimemagic, but it does not support newer file types (e.g. the MS Word docx format) because it has been inactive since 2006.
- http://sourceforge.net/projects/mime-util/files/mime-util/mime-util-2.1.3/ – khachik Dec 03 '10 at 18:56
- I don't think this is a duplicate of the referenced question, because the author explicitly asks for detection by file content, whereas the solutions to the other question work from a file (including its file name). – danielp Feb 02 '15 at 11:15
- What about the solutions [here](http://www.rgagnon.com/javadetails/java-0487.html)? Do they not work for you? – javamonkey79 Dec 04 '10 at 01:02
- mime-util does not work for Microsoft docx files: it reports them as application/zip. I expect something more specific, like application/vnd.openxmlformats-officedocument.wordprocessingml.document. I found that Apache Tika works for me. – Ajith Jose Dec 05 '10 at 05:05
- The code snippet shown in the link works, but not for files like Microsoft docx. – Ajith Jose Dec 05 '10 at 15:01
3 Answers
This may be useful for someone who also needs the most common office formats (and does not use Apache Tika):
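import java.net.FileNameMap;
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Map;
// Assumed third-party helpers: Apache Commons IO and Spring's StringUtils (hasText).
import org.apache.commons.io.FilenameUtils;
import org.springframework.util.StringUtils;

public class MimeTypeUtils {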
private static final Map<String, String> fileExtensionMap;
static {
fileExtensionMap = new HashMap<String, String>();
// MS Office
fileExtensionMap.put("doc", "application/msword");
fileExtensionMap.put("dot", "application/msword");
fileExtensionMap.put("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
fileExtensionMap.put("dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template");
fileExtensionMap.put("docm", "application/vnd.ms-word.document.macroEnabled.12");
fileExtensionMap.put("dotm", "application/vnd.ms-word.template.macroEnabled.12");
fileExtensionMap.put("xls", "application/vnd.ms-excel");
fileExtensionMap.put("xlt", "application/vnd.ms-excel");
fileExtensionMap.put("xla", "application/vnd.ms-excel");
fileExtensionMap.put("xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
fileExtensionMap.put("xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template");
fileExtensionMap.put("xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12");
fileExtensionMap.put("xltm", "application/vnd.ms-excel.template.macroEnabled.12");
fileExtensionMap.put("xlam", "application/vnd.ms-excel.addin.macroEnabled.12");
fileExtensionMap.put("xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12");
fileExtensionMap.put("ppt", "application/vnd.ms-powerpoint");
fileExtensionMap.put("pot", "application/vnd.ms-powerpoint");
fileExtensionMap.put("pps", "application/vnd.ms-powerpoint");
fileExtensionMap.put("ppa", "application/vnd.ms-powerpoint");
fileExtensionMap.put("pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
fileExtensionMap.put("potx", "application/vnd.openxmlformats-officedocument.presentationml.template");
fileExtensionMap.put("ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow");
fileExtensionMap.put("ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12");
fileExtensionMap.put("pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12");
fileExtensionMap.put("potm", "application/vnd.ms-powerpoint.template.macroEnabled.12");
fileExtensionMap.put("ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12");
// Open Office
fileExtensionMap.put("odt", "application/vnd.oasis.opendocument.text");
fileExtensionMap.put("ott", "application/vnd.oasis.opendocument.text-template");
fileExtensionMap.put("oth", "application/vnd.oasis.opendocument.text-web");
fileExtensionMap.put("odm", "application/vnd.oasis.opendocument.text-master");
fileExtensionMap.put("odg", "application/vnd.oasis.opendocument.graphics");
fileExtensionMap.put("otg", "application/vnd.oasis.opendocument.graphics-template");
fileExtensionMap.put("odp", "application/vnd.oasis.opendocument.presentation");
fileExtensionMap.put("otp", "application/vnd.oasis.opendocument.presentation-template");
fileExtensionMap.put("ods", "application/vnd.oasis.opendocument.spreadsheet");
fileExtensionMap.put("ots", "application/vnd.oasis.opendocument.spreadsheet-template");
fileExtensionMap.put("odc", "application/vnd.oasis.opendocument.chart");
fileExtensionMap.put("odf", "application/vnd.oasis.opendocument.formula");
fileExtensionMap.put("odb", "application/vnd.oasis.opendocument.database");
fileExtensionMap.put("odi", "application/vnd.oasis.opendocument.image");
fileExtensionMap.put("oxt", "application/vnd.openofficeorg.extension");
}
public static String getContentTypeByFileName(String fileName) {
// 1. First try Java's built-in file name map.
FileNameMap mimeTypes = URLConnection.getFileNameMap();
String contentType = mimeTypes.getContentTypeFor(fileName);
// 2. Nothing found: look up our own extension map to resolve types like ".doc" or ".docx".
if (!StringUtils.hasText(contentType)) {
String extension = FilenameUtils.getExtension(fileName);
contentType = fileExtensionMap.get(extension);
}
return contentType;
}
}
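
The map above keys on the file name, whereas the question asks about content. For a small set of formats the JDK can also guess from the leading bytes via `URLConnection.guessContentTypeFromStream`. The following is a minimal sketch of that call, not a replacement for the map; the class name `JdkContentGuess` and its `guess` helper are hypothetical, and the set of signatures the JDK recognizes is limited, so office formats should not be expected to resolve this way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: wraps the JDK's signature-based guess for a byte array.
final class JdkContentGuess {

    /** Returns the JDK's guess for the given bytes, or null when it has none. */
    static String guess(byte[] data) {
        // ByteArrayInputStream supports mark/reset, which the JDK method requires.
        try (ByteArrayInputStream in = new ByteArrayInputStream(data)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "GIF8" is one of the leading signatures the JDK checks for.
        byte[] gif = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(guess(gif));
    }
}
```

A caller could try this first and fall back to the extension map when it returns null.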

- I had to change `if (!StringUtils.hasText(contentType)) {` to `if (StringUtils.isBlank(contentType)) {`. Thanks for the codez! – Joshua Pinter Mar 18 '14 at 05:10
Use Apache Tika for content detection; see http://tika.apache.org/0.8/detection.html. Tika pulls in many JAR dependencies, which you can see when you build it with Maven.
ByteArrayInputStream bai = new ByteArrayInputStream(pByte);
ContentHandler contenthandler = new BodyContentHandler();
Metadata metadata = new Metadata();
Parser parser = new AutoDetectParser();
try {
parser.parse(bai, contenthandler, metadata);
} catch (IOException | SAXException | TikaException e) {
// Log the failure and fall through to read whatever metadata was collected.
e.printStackTrace();
}
System.out.println("Mime: " + metadata.get(Metadata.CONTENT_TYPE));
return metadata.get(Metadata.CONTENT_TYPE);
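
As noted in the comments on the question, tools that only look at the leading bytes report docx as application/zip, because OOXML files are ZIP containers. Where Tika's dependency footprint is a problem, one alternative is to open the container and look at the entry names. The following is a JDK-only sketch of that idea, not Tika's algorithm: the class `OoxmlSniffer` and its helpers are hypothetical, and the directory prefixes `word/`, `xl/`, and `ppt/` are a heuristic that cannot tell a docx from a docm or dotx (that would require reading `[Content_Types].xml`).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: distinguishes common OOXML files from plain ZIP archives.
final class OoxmlSniffer {

    static final String DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    static final String XLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    static final String PPTX = "application/vnd.openxmlformats-officedocument.presentationml.presentation";
    static final String ZIP = "application/zip";

    /** Returns a MIME type guess, or null when the bytes do not start with a ZIP local file header. */
    static String sniff(byte[] data) {
        if (data.length < 4 || data[0] != 'P' || data[1] != 'K' || data[2] != 3 || data[3] != 4) {
            return null;
        }
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Heuristic: each OOXML family keeps its parts under a well-known directory.
                if (name.startsWith("word/")) return DOCX;
                if (name.startsWith("xl/")) return XLSX;
                if (name.startsWith("ppt/")) return PPTX;
            }
            return ZIP;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo-only helper: builds a small in-memory ZIP with the given entry names. */
    static byte[] zipOf(String... entryNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write('x');
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(sniff(zipOf("[Content_Types].xml", "word/document.xml")));
        System.out.println(sniff(zipOf("readme.txt")));
        System.out.println(sniff(new byte[] {1, 2, 3, 4}));
    }
}
```

For anything beyond these three formats, a maintained detector such as Tika remains the more robust choice.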

- Please note: as with most Apache libraries, this one pulls in a very large tree of transitive dependencies. That can be a concern if you use the Maven Shade plugin to create an uber JAR. – TheRealChx101 Apr 27 '20 at 13:18
I use `javax.activation.MimetypesFileTypeMap`. It starts with a small set (`$JRE_HOME/lib/content-types.properties`), but you can add your own. Create a file `mime.types` in the format shown in `MimetypesFileTypeMap`'s javadoc (I started with a large list from the net, massaged it, and added types I found missing). You can then load it in your code by opening your `mime.types` file and adding its contents to your map. However, the easier solution is to add your `mime.types` file to the `META-INF` directory of your jar; `javax.activation` will pick it up automatically.
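
For illustration, a `mime.types` file in the format described by the `MimetypesFileTypeMap` javadoc puts one MIME type per line, followed by the space-separated file extensions mapped to it. The entries below are only examples taken from the office types listed earlier in this thread, not a complete list. Note that this map resolves types from the file name extension, not from the file content.

```
# META-INF/mime.types
# format: <mime type> <space-separated file extensions>
application/msword doc dot
application/vnd.openxmlformats-officedocument.wordprocessingml.document docx
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx
application/vnd.openxmlformats-officedocument.presentationml.presentation pptx
application/vnd.oasis.opendocument.text odt
```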
