21

Is there a Java library similar to the Unix `file` command?

For example:

$ file somepicture.png
somepicture.png PNG image, 805 x 292, 8-bit/color RGB, non-interlaced

The `file` command is such a nice tool. I need something that can tell me whether a file really is what I expect it to be (e.g. a picture or a document).

I know I can run the `file` command, but I am looking for a Java library, not a way to run the actual Unix command.

Shervin Asgari

7 Answers

18

A quick Google search for the (admittedly non-obvious) "java magic file detection" brings up a fairly nice-looking article, "Get the Mime Type from a File", which suggests you use one of the following:

Hasturkun
11

Since Java 1.7 you can use Files.probeContentType() to probe a file. Out of the box it uses the platform's own mechanism to guess the content type, or you can plug in your own detector (a java.nio.file.spi.FileTypeDetector) if you want.
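A minimal sketch of that call, for illustration only. The class name `ProbeExample`, the helper `describe` and the default file name are made up for this example. Note that the method may return null, and that the result depends on the JDK and platform: some default detectors look at the file name rather than the content, so this is not guaranteed to behave like `file`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ProbeExample {

    // Hypothetical helper: returns the probed MIME type,
    // or "unknown" when no installed detector recognises the file.
    static String describe(Path path) throws IOException {
        String type = Files.probeContentType(path); // may be null
        return type != null ? type : "unknown";
    }

    public static void main(String[] args) throws IOException {
        // The default file name is only a placeholder taken from the question.
        Path path = Paths.get(args.length > 0 ? args[0] : "somepicture.png");
        System.out.println(path + ": " + describe(path));
    }
}
```

Run it as `java ProbeExample somepicture.png`; what it prints will vary between systems.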

Aubin
Alan
3

You could look at jmimemagic (tutorial). We've been using it for a while to validate uploaded images. No problems so far.

sfussenegger
2

I am not sure it is exactly what you are looking for, but the following link may help:

http://www.rgagnon.com/javadetails/java-0487.html

Laurent
  • +1 for giving the JDK-only solution. I'm not sure how good that method is, but it could be what the user is looking for. – Brett Kail Apr 28 '10 at 13:37
  • 2
    @bkail `javax.activation.MimetypesFileTypeMap` only checks the file extension which surely isn't a reliable way of determining "if the file is really what I want it to be" – sfussenegger Apr 28 '10 at 14:15
  • Yes, the JDK solution does not work like `file` on Unix. `file` actually looks at the file and tells you what it is. It doesn't care about the extension the way Windows does – Shervin Asgari Apr 28 '10 at 17:26
  • @sfussenegger java.net.URLConnection.guessContentTypeFromStream(InputStream) surely does not use file extensions to make its determination. – Brett Kail Apr 29 '10 at 14:21
  • @bkail That wasn't mentioned in the linked article, was it? Anyway, it's not working reliably. I've just tried and it failed for a simple jpg image. – sfussenegger Apr 29 '10 at 14:53
  • @sfussenegger Huh, I thought getContent/getContentType used guessContentTypeFromStream. Ah, apparently only for "jar:" URLs. Thanks for the clarification... – Brett Kail Apr 29 '10 at 15:55
1

Have a look at mime-utils. It works with content and/or with extensions.

openCage
1

The closest thing in the JDK is URLConnection.guessContentTypeFromStream().
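A rough sketch of how that method can be called, for illustration. The class name `StreamGuessExample` and the helper `guess` are made up here. The method needs a stream that supports mark/reset, returns null when it cannot decide, and only knows a small set of signatures (a comment elsewhere in this thread reports it failing on a JPG), so treat it as a best-effort check.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StreamGuessExample {

    // Hypothetical helper: guesses a MIME type from the leading bytes of a stream.
    static String guess(InputStream in) throws IOException {
        // guessContentTypeFromStream peeks at the first bytes and then resets,
        // so wrap streams that do not support mark/reset.
        InputStream marked = in.markSupported() ? in : new BufferedInputStream(in);
        return URLConnection.guessContentTypeFromStream(marked); // may be null
    }

    public static void main(String[] args) throws IOException {
        // The default file name is only a placeholder taken from the question.
        String name = args.length > 0 ? args[0] : "somepicture.png";
        try (InputStream in = Files.newInputStream(Paths.get(name))) {
            String type = guess(in);
            System.out.println(name + ": " + (type != null ? type : "unknown"));
        }
    }
}
```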

finnw
0

I was looking for the same thing and found this: https://github.com/j256/simplemagic. It seems to be a clone of `file`; it even uses a copy of file's built-in magic file.

Markus Duft