32

Given a source file (or, more generally, an input stream), I need to find out:

  • is it an image
  • if it is an image, then retrieve its type (png/jpeg/gif/etc)
  • retrieve exif data, if available

I looked at the API, but it is not clear how to get the type of image or Exif data.

Matthias Braun
jdevelop

3 Answers

34

Last time I had to do this, a couple of years ago, the standard API couldn't read EXIF data. This library can do so though:

http://www.drewnoakes.com/code/exif/
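
For the first two points of the question (is it an image, and of what type), the standard API may already be enough. What follows is a minimal sketch, assuming the API in question is `javax.imageio`; the class name `ImageFormatProbe` and the method `formatNameOf` are made up for illustration. It asks ImageIO which registered reader claims the stream and returns that reader's format name, or `null` if none does. It does not read EXIF data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

// Hypothetical helper, not part of any library.
final class ImageFormatProbe {

    /**
     * Returns the format name of the first ImageIO reader that claims it can
     * decode the stream, or null if no registered reader recognises it.
     */
    static String formatNameOf(InputStream in) {
        // MemoryCacheImageInputStream buffers in memory, so no temp file is needed.
        try (ImageInputStream iis = new MemoryCacheImageInputStream(in)) {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                return null; // not an image any installed plugin knows about
            }
            ImageReader reader = readers.next();
            try {
                return reader.getFormatName();
            } finally {
                reader.dispose();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only formats with an installed ImageIO plugin are recognised, and the exact spelling or case of the returned name depends on the plugin, so compare it case-insensitively.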

OpenSauce
17

Easy answer: Use https://github.com/drewnoakes/metadata-extractor/

If you're crazy/brave/curious, you could get the image type from the stream by reading the first few bytes (the so-called magic numbers). I believe the EXIF data is generally near the start of the stream too.
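
The magic-number approach could be sketched as follows. This is an illustration, not a complete detector: the class `MagicNumberSniffer` and its methods are hypothetical names, it assumes Java, and it only knows the well-known signatures for PNG, JPEG and GIF.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of any library.
final class MagicNumberSniffer {

    /** Returns "png", "jpeg" or "gif" based on the leading bytes, or null if unknown. */
    static String sniff(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "png";
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return "jpeg";
        }
        if (startsWith(head, 'G', 'I', 'F', '8')) { // covers GIF87a and GIF89a
            return "gif";
        }
        return null;
    }

    /** Reads up to 8 bytes from the stream and sniffs them. Consumes those bytes. */
    static String sniff(InputStream in) throws IOException {
        byte[] head = new byte[8];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream ended early
            }
            n += r;
        }
        return sniff(Arrays.copyOf(head, n));
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) { // compare as unsigned bytes
                return false;
            }
        }
        return true;
    }
}
```

Because the stream overload consumes the bytes it inspects, a caller that wants to decode the image afterwards would need to wrap the stream (for example in a `BufferedInputStream`) and use `mark`/`reset`. A matching signature only suggests the type; it does not prove the rest of the file is valid.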

Drew Noakes
matt burns
  • Magic number link → Error 403 Forbidden. – 猫IT Aug 25 '19 at 08:09
  • Here is the `magic number` link from the Internet Archive: https://web.archive.org/web/20170611142413/https://www.astro.keele.ac.uk/oldusers/rno/Computing/File_magic.html – SubOptimal Aug 08 '23 at 09:05
8

It's an old thread, but I was doing this recently and found the Apache Tika library useful, particularly for analysing generic streams to detect what content they contain.

Thought it might help others.

http://tika.apache.org/

Eurospoofer