
I was using Java's ImageIO and BufferedImage for some image operations and wanted to see how they behave when I output the image as a jpg. I found some interesting behaviour that I can't quite explain. The following code reads an image and then writes the same image as "copy.jpg" to the same folder. The code is in Kotlin, but the functions used are Java functions:

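import java.io.File
import java.io.FileOutputStream
import javax.imageio.ImageIO

val image = File("some/image/path.jpg")
val bufImage = ImageIO.read(image)  // reading from the File avoids leaving an input stream open
FileOutputStream(File(image.parentFile, "copy.jpg")).use { os ->
    ImageIO.write(bufImage, "jpg", os)
}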

I would expect it to output the exact same file, except maybe for the metadata. However, the resulting file was only about a tenth the size of the original, and I doubt the metadata accounts for that much. The exact size difference varied depending on which image file I used, but the output image was smaller every time. Yet I could not see a quality difference from the original file: when zooming in, I saw the same pixels.

Why is the file size reduced so dramatically?

findusl
  • JPG is a lossy compression format. That could be why. – Jarvis Jan 03 '20 at 14:42
  • Instead of ImageIO.write, you probably want to do a direct byte buffer copy using another file writing operation if you want the exact same file. Inherently, using ImageIO.write probably applies an additional compression to the final file based on the extension. – Jarvis Jan 03 '20 at 14:44
  • The JPEG format has a number of options. For instance, if you export an image as a JPEG from [GIMP](https://www.gimp.org), it will ask you for the image quality setting, which affects how much the image is compressed. You are almost certainly saving your image with a greater compression factor than the original image used. – VGR Jan 03 '20 at 16:15
  • I think if you subtracted one image from the other, you'd find some differences. Also, what created your original jpg? Some cameras use jpg with a much higher quality than Java's default value. Also, every time an image is opened and saved using jpg compression with some loss value, it loses some information. – matt Jan 03 '20 at 18:27
  • Metadata for a JPEG file is typically the same size, regardless of the image dimensions or compression. So, for a small image, the metadata could occupy most of the file. But without knowing anything about what's in the original JPEG, this is of course just speculation. – Harald K Jan 06 '20 at 08:57
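
A minimal sketch of the byte-for-byte copy suggested in the comments above, for the case where an identical file is what's wanted. The function name `copyVerbatim` is made up for illustration; the target name "copy.jpg" comes from the question.

```kotlin
import java.io.File
import java.nio.file.Files
import java.nio.file.StandardCopyOption

// Copies the file's bytes verbatim. Nothing is decoded or re-encoded,
// so the copy is identical to the source, pixels and metadata alike.
// (copyVerbatim is a hypothetical helper name.)
fun copyVerbatim(source: File): File {
    // absoluteFile guards against a relative path that has no parent directory
    val target = File(source.absoluteFile.parentFile, "copy.jpg")
    Files.copy(source.toPath(), target.toPath(), StandardCopyOption.REPLACE_EXISTING)
    return target
}
```

Because no JPEG encoder is involved, the copy has exactly the same size as the original.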

1 Answer

JPEG is lossy compression: it throws away lots of information in order to keep the file small.  (An uncompressed image file could be orders of magnitude larger.)

It's intended to throw away information that you're not likely to see or care about, of course; but it still loses some image data.

And the loss is generational: if you have an image that came from a JPEG file, and then recompress it to a JPEG file, it will usually lose more data, giving a worse-quality result than the first JPEG file — even if the compression settings are exactly the same.  (Trying to approximate an already-compressed image won't work the same as trying to approximate the original source image. And there's no way to recover information which is already lost!)

That's almost certainly what's happening here.  Your code reads a JPEG file and expands it into a BufferedImage (which holds the uncompressed image data), and then compresses it again into a new JPEG file, which loses further quality.  It's probably using much heavier compression than the first file used, hence the smaller size.

I'd be surprised if you couldn't see any difference between the two JPEG files in an image viewer or editor, when magnified.  (JPEG artefacts are most obvious around sharp edges and boundaries, but if you know what to look for you can sometimes see them elsewhere.  Subtle changes can be easier to see if you can line up both images on the exact same area of screen and flip directly between them.)
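
If the difference is hard to spot by eye, it can also be measured. Here's a rough sketch that reports the largest per-channel difference between two decoded images; `maxChannelDifference` is a made-up helper name, and it assumes both images have the same dimensions.

```kotlin
import java.awt.image.BufferedImage
import kotlin.math.abs

// Returns the largest per-channel difference (0..255) between two images
// of equal size; 0 means every pixel is identical.
// (maxChannelDifference is a hypothetical helper, not part of ImageIO.)
fun maxChannelDifference(a: BufferedImage, b: BufferedImage): Int {
    require(a.width == b.width && a.height == b.height) { "Images must have the same dimensions" }
    var maxDiff = 0
    for (y in 0 until a.height) {
        for (x in 0 until a.width) {
            val p = a.getRGB(x, y)
            val q = b.getRGB(x, y)
            // Compare the red, green and blue channels separately (alpha is ignored).
            for (shift in intArrayOf(16, 8, 0)) {
                val d = abs(((p shr shift) and 0xFF) - ((q shr shift) and 0xFF))
                if (d > maxDiff) maxDiff = d
            }
        }
    }
    return maxDiff
}
```

Calling it with the results of `ImageIO.read(...)` for the original file and for "copy.jpg" should give a non-zero value if the recompression changed any pixels.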

You can control how much information is lost when creating a JPEG — but the ImageIO.write() method you're using doesn't provide a way to do that.  See this question for how to do it.  (It's in Java, but you should be able to follow it.)

Obviously, the more information you're prepared to lose, the smaller the file you can end up with.  But note that if you choose a high-quality setting, the result could be a lot larger than the first JPEG, even though it will probably still lose slightly more quality.
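
For illustration, here's a sketch of how the quality can be set explicitly through an ImageWriter and ImageWriteParam from javax.imageio. The function name `writeJpeg` and its parameters are my own invention, and it assumes the image has no alpha channel (e.g. TYPE_INT_RGB), since the standard JPEG writer doesn't handle transparency well.

```kotlin
import java.awt.image.BufferedImage
import java.io.File
import javax.imageio.IIOImage
import javax.imageio.ImageIO
import javax.imageio.ImageWriteParam
import javax.imageio.stream.FileImageOutputStream

// Writes `image` as a JPEG with an explicit quality between 0.0 (smallest file)
// and 1.0 (best quality). writeJpeg is a hypothetical helper name.
fun writeJpeg(image: BufferedImage, target: File, quality: Float) {
    val writer = ImageIO.getImageWritersByFormatName("jpg").next()
    try {
        val param = writer.defaultWriteParam.apply {
            compressionMode = ImageWriteParam.MODE_EXPLICIT
            compressionQuality = quality
        }
        // FileImageOutputStream does not truncate an existing file, so remove it first.
        target.delete()
        FileImageOutputStream(target).use { out ->
            writer.setOutput(out)
            // No stream metadata, no thumbnails, no image metadata.
            writer.write(null, IIOImage(image, null, null), param)
        }
    } finally {
        writer.dispose()
    }
}
```

Writing the same image at, say, 0.3 and 0.95 and comparing the file sizes shows the trade-off directly. As far as I know, plain `ImageIO.write()` uses the writer's default quality, which is lower than what many cameras use.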

(That's why, if you're doing any sort of processing on an image, it's best to keep it in lossless formats until the very end, and compress to a lossy format like JPEG only once, to avoid losing quality each time you save and reload.)
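
To make the lossless-versus-lossy point concrete, here's a small sketch that encodes an image in memory, decodes it again, and checks whether the pixels survived. The helper names `roundTrip` and `samePixels` are made up for this example.

```kotlin
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import javax.imageio.ImageIO

// Encodes the image in the given format in memory and decodes it again.
// (roundTrip is a hypothetical helper name.)
fun roundTrip(image: BufferedImage, format: String): BufferedImage {
    val bytes = ByteArrayOutputStream().also { ImageIO.write(image, format, it) }.toByteArray()
    return ImageIO.read(ByteArrayInputStream(bytes))
}

// True if both images have the same size and identical RGB values everywhere.
fun samePixels(a: BufferedImage, b: BufferedImage): Boolean {
    if (a.width != b.width || a.height != b.height) return false
    for (y in 0 until a.height) {
        for (x in 0 until a.width) {
            if (a.getRGB(x, y) != b.getRGB(x, y)) return false
        }
    }
    return true
}
```

For an opaque RGB image with some detail in it, `samePixels(img, roundTrip(img, "png"))` should be true, while the same check with "jpg" will normally be false.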

As you indicate, another reason could be the loss of non-image data — you're unlikely to notice the loss of metadata such as camera settings, but the file could have had a sizeable thumbnail image too.

gidds