How to keep JPEG metadata using PIL?

Question

I am using the PIL library to resize images, in order to automate the inputs for another piece of software at my job. That software needs the metadata stored in the images.

The image is resized correctly, but it loses all of the metadata it contained.

I used the code below to resize.

from PIL import Image
filename = r'input.jpg'
ratio=0.2
im = Image.open(filename)
out = im.resize([int(ratio * s) for s in im.size], Image.ANTIALIAS)
out.save("out.jpg", format=im.format, optimize=True)

I can see all of the metadata in the image object's dict using:

im.__dict__

Libraries like pyexif, pyexif2, etc. return null values. I checked the exif entry in im.__dict__ and it is empty.

Is there any method of adding this dictionary to my resized image?

link to image download: Image

Thank you very much.

EDIT: Results for im.__dict__. This dictionary contains XML code, which is responsible for transferring metadata to the software that my company uses. The XML appears in three places: info['comment'], app['COM'] and applist.

{'im': None,
'mode': 'RGB',
'_size': (4096, 2304),
'palette': None,
'info': {'jfif': 258,
     'jfif_version': (1, 2),
         'jfif_unit': 0, 'jfif_density': (1, 1),
         'comment': b'<?xml version="1.0" encoding="utf-8"?>\n<image time="09:04:10.271222" date="2020.06.15" acq_index="7666">\n\t<Position time="20200615T090410.834" received="2020-Jun-15 09:04:10.267082" extrapolated="false" age="4" transponder_id="0">\n\t\t<Coords long="-40.9273182" lat="-22.7837963"/>\n\t\t<Depth altitude="5.14" depth="85.57"/>\n\t\t<Direction pitch="-3.09" roll="0.03" yaw="180.02"/>\n\t</Position>\n\t<acquisition>\n\t\t<exposure>5000</exposure>\n\t\t<digital_gain>1.19</digital_gain>\n\t\t<analog_gain>6</analog_gain>\n\t\t<sensor_gain>4</sensor_gain>\n\t\t<aperture>1.4</aperture>\n\t\t<focus>498</focus>\n\t\t<name>ColorCamera</name>\n\t\t<camera_session_name>start_1</camera_session_name>\n\t\t<camera_sub_session_name/>\n\t\t<focus_enc>3945</focus_enc>\n\t\t<width>4096</width>\n\t\t<height>2304</height>\n\t\t<seq_slot>0</seq_slot>\n\t\t<dequeue_time>2020-06-15T09:04:10.887848</dequeue_time>\n\t</acquisition>\n\t<errors/>\n\t<versions>\n\t\t<software>0.968s4</software>\n\t\t<fpga>0x02d1</fpga>\n\t\t<pic>210</pic>\n\t\t<serial_number>191</serial_number>\n\t</versions>\n\t<ntp>\n\t\t<ntpq>*192.168.99.100                   1 u   69  128  377    0.196   -3.329   0.521</ntpq>\n\t\t<state>within_limits</state>\n\t\t<sync_level>excellent_sync</sync_level>\n\t</ntp>\n\t<pps/>\n</image>\n'}, 'category': 0, 'readonly': 1, 'pyaccess': None, '_exif': None, '_min_frame': 0, 'custom_mimetype': None, 'tile': [('jpeg', (0, 0, 4096, 2304), 0, ('RGB', ''))],
'decoderconfig': (),
'decodermaxblock': 65536,
'fp': <_io.BufferedReader name='input.jpg'>,
'filename': 'input.jpg',
'_exclusive_fp': True, 'bits': 8,
'layers': 3,
'layer': [(1, 2, 2, 0), (2, 1, 1, 1), (3, 1, 1, 1)],
'huffman_dc': {},
'huffman_ac': {},
'quantization': {0: array('B', [3, 2, 2, 3, 2, 2, 3, 3, 3, 3, 4, 3, 3, 4, 5, 8, 5, 5, 4, 4, 5, 10, 7, 7, 6, 8, 12, 10, 12, 12, 11, 10, 11, 11, 13, 14, 18, 16, 13, 14, 17, 14, 11, 11, 16, 22, 16, 17, 19, 20, 21, 21, 21, 12, 15, 23, 24, 22, 20, 24, 18, 20, 21, 20]), 
                 1: array('B', [3, 4, 4, 5, 4, 5, 9, 5, 5, 9, 20, 13, 11, 13, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20])},

'app': {'APP0': b'JFIF\x00\x01\x02\x00\x00\x01\x00\x01\x00\x00', 
        'COM': b'<?xml version="1.0" encoding="utf-8"?>\n<image time="09:04:10.271222" date="2020.06.15" acq_index="7666">\n\t<Position time="20200615T090410.834" received="2020-Jun-15 09:04:10.267082" extrapolated="false" age="4" transponder_id="0">\n\t\t<Coords long="-40.9273182" lat="-22.7837963"/>\n\t\t<Depth altitude="5.14" depth="85.57"/>\n\t\t<Direction pitch="-3.09" roll="0.03" yaw="180.02"/>\n\t</Position>\n\t<acquisition>\n\t\t<exposure>5000</exposure>\n\t\t<digital_gain>1.19</digital_gain>\n\t\t<analog_gain>6</analog_gain>\n\t\t<sensor_gain>4</sensor_gain>\n\t\t<aperture>1.4</aperture>\n\t\t<focus>498</focus>\n\t\t<name>ColorCamera</name>\n\t\t<camera_session_name>start_1</camera_session_name>\n\t\t<camera_sub_session_name/>\n\t\t<focus_enc>3945</focus_enc>\n\t\t<width>4096</width>\n\t\t<height>2304</height>\n\t\t<seq_slot>0</seq_slot>\n\t\t<dequeue_time>2020-06-15T09:04:10.887848</dequeue_time>\n\t</acquisition>\n\t<errors/>\n\t<versions>\n\t\t<software>0.968s4</software>\n\t\t<fpga>0x02d1</fpga>\n\t\t<pic>210</pic>\n\t\t<serial_number>191</serial_number>\n\t</versions>\n\t<ntp>\n\t\t<ntpq>*192.168.99.100                   1 u   69  128  377    0.196   -3.329   0.521</ntpq>\n\t\t<state>within_limits</state>\n\t\t<sync_level>excellent_sync</sync_level>\n\t</ntp>\n\t<pps/>\n</image>\n'},

'applist': [('APP0', b'JFIF\x00\x01\x02\x00\x00\x01\x00\x01\x00\x00'), ('COM', b'<?xml version="1.0" encoding="utf-8"?>\n<image time="09:04:10.271222" date="2020.06.15" acq_index="7666">\n\t<Position time="20200615T090410.834" received="2020-Jun-15 09:04:10.267082" extrapolated="false" age="4" transponder_id="0">\n\t\t<Coords long="-40.9273182" lat="-22.7837963"/>\n\t\t<Depth altitude="5.14" depth="85.57"/>\n\t\t<Direction pitch="-3.09" roll="0.03" yaw="180.02"/>\n\t</Position>\n\t<acquisition>\n\t\t<exposure>5000</exposure>\n\t\t<digital_gain>1.19</digital_gain>\n\t\t<analog_gain>6</analog_gain>\n\t\t<sensor_gain>4</sensor_gain>\n\t\t<aperture>1.4</aperture>\n\t\t<focus>498</focus>\n\t\t<name>ColorCamera</name>\n\t\t<camera_session_name>start_1</camera_session_name>\n\t\t<camera_sub_session_name/>\n\t\t<focus_enc>3945</focus_enc>\n\t\t<width>4096</width>\n\t\t<height>2304</height>\n\t\t<seq_slot>0</seq_slot>\n\t\t<dequeue_time>2020-06-15T09:04:10.887848</dequeue_time>\n\t</acquisition>\n\t<errors/>\n\t<versions>\n\t\t<software>0.968s4</software>\n\t\t<fpga>0x02d1</fpga>\n\t\t<pic>210</pic>\n\t\t<serial_number>191</serial_number>\n\t</versions>\n\t<ntp>\n\t\t<ntpq>*192.168.99.100                   1 u   69  128  377    0.196   -3.329   0.521</ntpq>\n\t\t<state>within_limits</state>\n\t\t<sync_level>excellent_sync</sync_level>\n\t</ntp>\n\t<pps/>\n</image>\n')],

'icclist': []}

See https://stackoverflow.com/questions/4764932/in-python-how-do-i-read-the-exif-data-for-an-image to read exif. Don't know how to write exif, sorry. — Tarik, Jun 21 '20 at 20:25
Does this answer your question? [Preserve exif data of image with PIL when resize(create thumbnail)](https://stackoverflow.com/questions/17042602/preserve-exif-data-of-image-with-pil-when-resizecreate-thumbnail) — Mike Scotty, Jun 21 '20 at 21:09
This solution didn't work, because every time I read the exif data the result is empty. I can only see the data using im.__dict__ — Vinicius Nogueira, Jun 21 '20 at 23:21

score 3 · Answer 1 · answered Jun 21 '20 at 23:08

You could add the metadata extracted from your original file to the resized one by means of the exif keyword argument. If you modify your code like this:

from PIL import Image

filename = r'input.jpg'
ratio = 0.2

im = Image.open(filename)
EXIF = im.getexif()

# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
out = im.resize([int(ratio * s) for s in im.size], Image.LANCZOS)
out.save("out.jpg", format=im.format, optimize=True, exif=EXIF)

The EXIF metadata should now be transferred to the new image.
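Note that the exif argument only carries EXIF data, which a JPEG stores in an APP1 segment. If getexif() comes back empty, it can help to check which segments the file actually contains. The following is a minimal sketch using only the standard library; list_jpeg_segments is a hypothetical helper name, and it assumes a well-formed header with no fill bytes between segments.

```python
import struct


def list_jpeg_segments(data: bytes):
    """Return (name, payload) pairs for the segments found before the scan data.

    Assumes a well-formed header with no fill bytes between segments.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker in (0xD9, 0xDA):  # EOI or SOS: no more header segments
            break
        # The 2-byte big-endian length counts itself plus the payload.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if 0xE0 <= marker <= 0xEF:
            name = f"APP{marker - 0xE0}"
        elif marker == 0xFE:
            name = "COM"
        else:
            name = f"0x{marker:02X}"
        segments.append((name, payload))
        pos += 2 + length
    return segments
```

Calling it on the bytes of a file, for example list_jpeg_segments(open('input.jpg', 'rb').read()), lists the segment names in order. A file whose metadata lives in a JPEG comment would show a COM entry and no APP1 entry, in which case the exif argument has nothing to copy.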

panadestein

I tried it and it didn't work. Anything I use to get the exif metadata returns null. I can see the metadata only with im.__dict__. – Vinicius Nogueira Jun 21 '20 at 23:19
@ViniciusNogueira that is strange. I am using a [test](https://upload.wikimedia.org/wikipedia/commons/c/c6/Kimi_Raikkonen_2006_test.jpg) JPG image; when I reduce the size and reimport the `out.jpg` image, I am able to retrieve the metadata with `im.getexif()`. I am also able to see the metadata with Gnome Image Viewer, for example. Can you try printing the metadata of the *same* test image I suggested? – panadestein Jun 21 '20 at 23:28
With your image it works great. The format of my image is the problem (it doesn't have exif tags; it is another kind of metadata, and I don't know how to handle it yet). I uploaded it, can you try with it? https://1drv.ms/u/s!Av58o7S5NWNXjP4QCPuyZ2PzJGlonQ?e=tgRohl My result: with `filename = r'input.jpg'`, `im = Image.open(filename)`, `EXIF = im.getexif()`, `print(EXIF)` gives `{}` – Vinicius Nogueira Jun 21 '20 at 23:39
Using gdalinfo: [results for the input](https://1drv.ms/u/s!Av58o7S5NWNXjP4RixQUUlwKLKuyow?e=vQwC2i) and [results for the output](https://1drv.ms/u/s!Av58o7S5NWNXjP4SHdgIQWVtlIMJDQ?e=qhhdcp). I can't see the COMMENT metadata in out.jpg – Vinicius Nogueira Jun 21 '20 at 23:55
@ViniciusNogueira I have tried the original image you shared in OneDrive both with Python and with Gnome Image Viewer, it does not have any metadata, that is why it can't be transferred. – panadestein Jun 22 '20 at 00:25
I know that it doesn't have exif metadata, but its metadata contains a comment with XML (I updated the post with the results from im.__dict__). The output of the GDAL tool ([GDAL_INFO_INPUT](https://1drv.ms/u/s!Av58o7S5NWNXjP4RixQUUlwKLKuyow?e=vQwC2i)) shows the same thing as im.__dict__ – Vinicius Nogueira Jun 22 '20 at 09:59

Mark Setchell · Answer 2 · 2020-06-22T11:00:14.823

Maybe you could use wand instead of PIL as it propagates the comment forward for you automagically:

#!/usr/bin/env python3

from wand.image import Image

with Image(filename='input.jpg') as img: 
    img.save(filename='result.jpg')

Or, here's a possible work-around. You can extract the comment from input.jpg into a file called comment.txt like this:

jhead -cs comment.txt input.jpg

You can then write that comment into a different file called result.jpg like this:

jhead -ci comment.txt result.jpg

I assume you could use the Python subprocess module to copy forward your data using something like:

import subprocess

# Propagate JPEG comment forward from "input.jpg" to "result.jpg"
subprocess.run('jhead -cs - input.jpg | jhead -ci - result.jpg', shell=True)

score 0 · Accepted Answer · answered Jun 24 '20 at 10:38

Thank you all.

@Mark Setchell thank you so much, your solution using wand and ImageMagick works fine. But at my job, not everyone deals with software installation, programming, etc., so I decided to use only standard Python tools.

I made a simple solution using only PIL. First I read and resized the image with the code below:

from PIL import Image

ratio = 0.2
raw_image = Image.open('input.jpg')
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
out = raw_image.resize([int(ratio * s) for s in raw_image.size], Image.LANCZOS)
out.save('out.jpg', format=raw_image.format, quality=100, optimize=True)

After saving the resized image, I took the comment field of the input image and inserted it into the resized image. To do this, I write the COM marker b'\xff\xfe', followed by the two-byte segment length and the comment bytes from the input image, right after the first two bytes of the output file.

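import struct

comment = raw_image.app['COM']
# A COM segment is the marker FF FE, then a 2-byte big-endian length that
# counts the length field itself plus the payload. Computing it here avoids
# hardcoding b'\x04\xb5' (1205), which only fits one specific comment size.
com_segment = b'\xff\xfe' + struct.pack('>H', len(comment) + 2) + comment

with open('out.jpg', 'r+b') as f:
    img = f.read()
    # Insert the segment right after the SOI marker (the first two bytes).
    f.seek(0)
    f.truncate()
    f.write(img[:2] + com_segment + img[2:])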
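For reference, here is a slightly more defensive sketch of the same idea that works on bytes and uses only the standard library. The name insert_jpeg_comment is hypothetical. It assumes the file starts with SOI optionally followed by APP0 segments, and it places the comment after those so the JFIF header stays directly after SOI. It also rejects comments that do not fit in one segment.

```python
import struct


def insert_jpeg_comment(jpeg: bytes, comment: bytes) -> bytes:
    """Return a copy of `jpeg` with a COM segment holding `comment`."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    # The 2-byte length field counts itself, so the payload limit is 65533.
    if len(comment) > 0xFFFF - 2:
        raise ValueError("comment too long for a single COM segment")
    pos = 2
    # Skip any APP0 segments so they remain directly after SOI.
    while jpeg[pos:pos + 2] == b"\xff\xe0":
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + length
    segment = b"\xff\xfe" + struct.pack(">H", len(comment) + 2) + comment
    return jpeg[:pos] + segment + jpeg[pos:]
```

With the resized file on disk, this could be applied by reading out.jpg in binary mode, passing its bytes together with raw_image.app['COM'], and writing the result back. As far as I know, recent Pillow releases (9.4.0 and later) can also write a JPEG comment directly when saving, so it is worth checking the Pillow documentation before patching bytes by hand.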

I believe this code can be improved a lot, but first I have to study much more about coding and Python. Also, running this code with multiprocessing greatly reduces the processing time.

Thank you all for the help and the discussion.

Regards, Vinicius
