
I need to analyze thousands of JPEG files by retrieving their EXIF data. That is more than 50 GB of data. I cannot read the whole files because it would take too much time.

Is there any method in C# to read only the EXIF data from those files, without loading and decompressing the whole JPEG files?

EDIT: Why do I need a fast method?
I tried the solution from this question: How to get the EXIF data from a file using C#
For 1000 images with a total size of ~1 GB, it took 3 minutes to analyze. So for a larger (50 GB) library of photos it could take over 2 hours. When you need information such as "What is the preferred zoom used by your customer?" almost immediately, that is too slow.

Marek Kwiendacz

4 Answers


You'll find some code samples in ExifLib - A Fast Exif Data Extractor for .NET 2.0+ (and a full project too) that shows how to read the minimum data necessary to get just the EXIF information out.

yamen

I've recently ported my Java metadata-extractor library to .NET. It's been active since 2002 and has had heavy testing through widespread use. In my tests, it churns through 2GB of images, extracting all metadata, in around 4 seconds on my machine. You could optimise further by telling it to only read specific types of metadata, such as Exif. It supports many image/video formats, and many metadata types.

Available on GitHub and NuGet.

Drew Noakes

GdPicture.NET Imaging SDK, starting with version 10, provides a new image parsing mechanism that allows direct access to image metadata (EXIF, GPS, XMP, IPTC...) without decoding pixels. It supports more than 90 image formats including JPEG, TIFF, RAW and WebP.

Here is a link to the GdPicture.NET knowledge base that demonstrates how to extract metadata using C# and VB.NET (many other languages are also supported): tutorial

In case anybody needs further information I will be glad to assist.

Disclaimer: I am the product architect of GdPicture.NET.


You don't need to decompress anything. The Exif information is held in the header before the image data, so all you need to do is open the file, read the Exif header and decode whatever it is you need. This applies if you read the Exif data manually (which isn't hard).

If all you need is the sizes, that is right at the front.

Edit: note that the Exif data doesn't actually have to be at the front, but it almost always is, so in general you can assume that reading it will be a lot faster than if it were stored elsewhere in the file.

Also, have you checked that using the standard API is 'too slow'? I wouldn't have thought it would take that long for 50G (or that doing it a different way would necessarily be faster).
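To make the "read only the header" idea concrete, here is a minimal sketch using only the base class library. It walks the JPEG segment markers until it finds the APP1 segment that starts with "Exif\0\0", then looks up one ASCII tag in the first TIFF directory (IFD0). The class and method names (ExifHeaderSketch, ReadExifPayload, ReadAsciiTag) are made up for illustration and are not from any library. It assumes a well-formed file, stops at the start of the compressed image data, and does not follow the Exif sub-IFD, where values such as focal length are stored.

```csharp
using System;
using System.IO;
using System.Text;

public static class ExifHeaderSketch
{
    // Returns the TIFF-structured Exif payload of the APP1 segment, or null if none is found.
    // Only the segments before the image data are read; pixels are never decoded.
    public static byte[] ReadExifPayload(Stream s)
    {
        // Every JPEG starts with the SOI marker FF D8.
        if (s.ReadByte() != 0xFF || s.ReadByte() != 0xD8) return null;
        while (true)
        {
            int marker = s.ReadByte();
            if (marker != 0xFF) return null;                // lost sync or end of stream
            while (marker == 0xFF) marker = s.ReadByte();   // skip optional fill bytes
            // SOS (DA) starts the compressed data, EOI (D9) ends the file: give up.
            if (marker < 0 || marker == 0xDA || marker == 0xD9) return null;

            // Big-endian segment length; it includes its own two bytes.
            int hi = s.ReadByte(), lo = s.ReadByte();
            if (lo < 0) return null;
            int len = ((hi << 8) | lo) - 2;
            if (len < 0) return null;

            // Segments are at most 64 KB, so reading each one fully is cheap.
            byte[] seg = new byte[len];
            if (!Fill(s, seg)) return null;

            if (marker == 0xE1 && len > 6 && Encoding.ASCII.GetString(seg, 0, 6) == "Exif\0\0")
            {
                byte[] tiff = new byte[len - 6];
                Array.Copy(seg, 6, tiff, 0, tiff.Length);
                return tiff;
            }
        }
    }

    // Looks up an ASCII-typed tag (e.g. 0x010F = Make, 0x0110 = Model) in IFD0.
    public static string ReadAsciiTag(byte[] tiff, ushort tagId)
    {
        if (tiff == null || tiff.Length < 8) return null;
        bool le = tiff[0] == 'I' && tiff[1] == 'I';         // "II" = little-endian
        if (!le && !(tiff[0] == 'M' && tiff[1] == 'M')) return null; // "MM" = big-endian
        int ifd = (int)U32(tiff, 4, le);                    // offset of IFD0 from TIFF start
        if (ifd < 8 || ifd + 2 > tiff.Length) return null;
        int count = U16(tiff, ifd, le);
        for (int i = 0; i < count; i++)
        {
            int e = ifd + 2 + i * 12;                       // each entry is 12 bytes
            if (e + 12 > tiff.Length) return null;
            if (U16(tiff, e, le) != tagId) continue;
            if (U16(tiff, e + 2, le) != 2) return null;     // type 2 = ASCII
            int n = (int)U32(tiff, e + 4, le);              // byte count including the NUL
            // Values of up to 4 bytes are stored inline; longer ones via an offset.
            int off = n <= 4 ? e + 8 : (int)U32(tiff, e + 8, le);
            if (n <= 0 || off < 0 || off + n > tiff.Length) return null;
            return Encoding.ASCII.GetString(tiff, off, n).TrimEnd('\0');
        }
        return null;
    }

    static int U16(byte[] b, int o, bool le) =>
        le ? b[o] | (b[o + 1] << 8) : (b[o] << 8) | b[o + 1];

    static uint U32(byte[] b, int o, bool le) => le
        ? (uint)(b[o] | (b[o + 1] << 8) | (b[o + 2] << 16) | (b[o + 3] << 24))
        : (uint)((b[o] << 24) | (b[o + 1] << 16) | (b[o + 2] << 8) | b[o + 3]);

    // Reads until the buffer is full; returns false if the stream ends first.
    static bool Fill(Stream s, byte[] buf)
    {
        int got = 0;
        while (got < buf.Length)
        {
            int n = s.Read(buf, got, buf.Length - got);
            if (n <= 0) return false;
            got += n;
        }
        return true;
    }
}
```

With a stream from File.OpenRead(path), calling ExifHeaderSketch.ReadAsciiTag(ExifHeaderSketch.ReadExifPayload(stream), 0x010F) would return the camera make, or null when the file has no Exif block before the image data. For anything beyond a quick experiment, one of the libraries mentioned in the other answers is likely the safer choice.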

Woody
  • It may not be hard... but the actual encoding method seems to be kept a secret. If anyone knows how to read the EXIF info directly without using a library, I'd like to know. For instance, I can see the Make, Model, and dates stored in plain text... but looking at a hex dump I can't find the tags that supposedly mark their locations. – Mark T Mar 14 '19 at 16:25
  • The Exif format is very well documented, apart from specific proprietary tags. Obviously this answer is from 7 years ago, so I don't have anything to hand, but if you start at the Wikipedia entry for Exif, it gives links to the Exif site where the documentation is. It is a tag-based format, the same as the TIFF structure. Somewhere I have general code for pulling tags out if you need it, but I can't add it in a comment. – Woody Mar 18 '19 at 10:52