I know this looks too simple but I couldn’t find a straight forward solution.
Once saved, the itxt should be compressed again.
It's not as simple as you eyeballed it. If it were, you might have found out there is no straightforward solution.
Let's start with the basics.
An important question, because modifying an existing PNG file is a large task. Reading its documentation, it doesn't start out well:
PNG: Chunk by Chunk
Ancillary Chunks
.. iTXt
Ignored when reading. Not generated.
(https://pythonhosted.org/pypng/chunk.html)
But lower on that page, salvation!
Non-standard Chunks
Generally it is not possible to generate PNG images with any other chunk types. When reading a PNG image, processing it using the chunk interface, png.Reader.chunks, will allow any chunk to be processed (by user code).
So all I have to do is write this 'user code', and PyPNG can do the rest. (Oof.)
What is in an iTXt chunk?
Let's take a peek at what you are interested in.
4.2.3.3. iTXt International textual data
.. the textual data is in the UTF-8 encoding of the Unicode character set instead of Latin-1. This chunk contains:
Keyword:             1-79 bytes (character string)
Null separator:      1 byte
Compression flag:    1 byte
Compression method:  1 byte
Language tag:        0 or more bytes (character string)
Null separator:      1 byte
Translated keyword:  0 or more bytes
Null separator:      1 byte
Text:                0 or more bytes
(http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html#C.iTXt)
Looks clear to me. The optional compression ought not be a problem, since
.. [t]he only value presently defined for the compression method byte is 0, meaning zlib ..
and I am pretty confident there is something existing for Python that can do this for me.
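That "something existing" is the standard zlib module. As a minimal sketch (written for Python 3, with a made-up sample string), this is the round trip a compressed text field would go through:

```python
import zlib

# Hypothetical sample text; it is encoded to UTF-8 bytes before compressing.
text = "Dat was leuk. " * 10
raw = text.encode("utf-8")

compressed = zlib.compress(raw)          # bytes as they would be stored in the chunk
restored = zlib.decompress(compressed)   # bytes as a reader would recover them

print(len(raw), "bytes ->", len(compressed), "bytes")
print(restored == raw)
```

Short strings can actually grow when compressed, which is presumably why the compression flag is optional.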
Back to PyPNG's chunk handling then.
PyPNG offers an iterator, so indeed checking if a PNG contains an iTXt chunk is easy:
chunks()
Return an iterator that will yield each chunk as a (chunktype, content) pair.
(https://pythonhosted.org/pypng/png.html?#png.Reader.chunks)
So let's write some code in interactive mode and check. I got a sample image from http://pmt.sourceforge.net/itxt/, repeated here for convenience. (If the iTXt data is not preserved here, download and use the original.)
>>> import png
>>> imageFile = png.Reader("itxt.png")
>>> print imageFile
<png.Reader instance at 0x10ae1cfc8>
>>> for c in imageFile.chunks():
...     print c[0], len(c[1])
...
IHDR 13
gAMA 4
sBIT 4
pCAL 44
tIME 7
bKGD 6
pHYs 9
tEXt 9
iTXt 39
IDAT 4000
IDAT 831
zTXt 202
iTXt 111
IEND 0
Success!
What about writing back? Well, PyPNG is usually used to create complete images, but fortunately it also offers a method to explicitly create one from custom chunks:
png.write_chunks(out, chunks)
Create a PNG file by writing out the chunks.
So we can iterate over the chunks, change the one(s) you want, and write back the modified PNG.
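To make the read-modify-write idea concrete without needing an image file, here is a stdlib-only sketch for Python 3. It is a stand-in, not how PyPNG is implemented: iter_chunks and serialize_chunks are hypothetical helpers that play the roles of png.Reader.chunks and png.write_chunks, and the sample stream is hand-made and not a viewable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(png_bytes):
    """Yield (chunktype, content) pairs from raw PNG bytes."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    pos = 8
    while pos < len(png_bytes):
        # Chunk layout: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunktype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        content = png_bytes[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png_bytes[pos + 8 + length:pos + 12 + length])
        # The CRC covers the type and the data, not the length field.
        if crc != zlib.crc32(chunktype + content) & 0xFFFFFFFF:
            raise ValueError("bad CRC in chunk %r" % chunktype)
        yield chunktype, content
        pos += 12 + length

def make_chunk(chunktype, content):
    """Serialize a single chunk, including its CRC."""
    crc = zlib.crc32(chunktype + content) & 0xFFFFFFFF
    return struct.pack(">I", len(content)) + chunktype + content + struct.pack(">I", crc)

def serialize_chunks(chunks):
    """Build a PNG byte stream from (chunktype, content) pairs."""
    return PNG_SIGNATURE + b"".join(make_chunk(t, c) for t, c in chunks)

# Hand-made stream with only a tEXt and an IEND chunk, enough to exercise the loop.
sample = serialize_chunks([(b"tEXt", b"Title\x00Old"), (b"IEND", b"")])

modified = []
for chunktype, content in iter_chunks(sample):
    if chunktype == b"tEXt":
        content = b"Title\x00New"   # change the one chunk we care about
    modified.append((chunktype, content))

result = serialize_chunks(modified)
print(list(iter_chunks(result)))
```

Chunk lengths and CRCs are recomputed when writing, so the modified content can be any length.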
Processing iTXt data
This is a task in itself. The data format is well described, but not suitable for Python's native struct.unpack and struct.pack functions. So we have to invent something ourselves.
The text strings are stored in ASCIIZ format: a string ending with a zero byte. We need a small function to split on the first 0:
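def cutASCIIZ(data):
    # split at the first zero byte; the zero itself is dropped
    end = data.find(chr(0))
    if end >= 0:
        return [data[:end], data[end+1:]]
    # no zero byte found: nothing before, everything after
    return ['', data]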
This quick-and-dirty function returns a [before, after] list and discards the zero itself.
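The helper above works on Python 2 byte strings. If you are on Python 3, where chunk data is a bytes object, a rough equivalent could look like this (cut_asciiz is my own name for it, and it mirrors the same "nothing found" behaviour):

```python
def cut_asciiz(data):
    """Split bytes at the first zero byte and return [before, after]."""
    before, sep, after = data.partition(b"\x00")
    if sep:
        return [before, after]
    # No zero byte found: nothing before, everything after.
    return [b"", data]

print(cut_asciiz(b"Author\x00remaining data"))
```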
To handle the iTXt data as transparently as possible, I make it a class:
class Chunk_iTXt:
    def __init__(self, chunk_data):
        # keyword, terminated by a zero byte
        tmp = cutASCIIZ(chunk_data)
        self.keyword = tmp[0]
        # compression flag and method, one byte each (0 if missing)
        if len(tmp[1]):
            self.compressed = ord(tmp[1][0])
        else:
            self.compressed = 0
        if len(tmp[1]) > 1:
            self.compressionMethod = ord(tmp[1][1])
        else:
            self.compressionMethod = 0
        # language tag, terminated by a zero byte
        tmp = tmp[1][2:]
        tmp = cutASCIIZ(tmp)
        self.languageTag = tmp[0]
        # translated keyword, terminated by a zero byte
        tmp = tmp[1]
        tmp = cutASCIIZ(tmp)
        self.languageTagTrans = tmp[0]
        # everything that remains is the text, optionally compressed
        if self.compressed:
            if self.compressionMethod != 0:
                raise TypeError("Unknown compression method")
            self.text = zlib.decompress(tmp[1])
        else:
            self.text = tmp[1]

    def pack(self):
        # rebuild the chunk data in the order the specification lists the fields
        result = self.keyword+chr(0)
        result += chr(self.compressed)
        result += chr(self.compressionMethod)
        result += self.languageTag+chr(0)
        result += self.languageTagTrans+chr(0)
        if self.compressed:
            if self.compressionMethod != 0:
                raise TypeError("Unknown compression method")
            result += zlib.compress(self.text)
        else:
            result += self.text
        return result

    def show(self):
        print 'iTXt chunk contents:'
        print ' keyword: "'+self.keyword+'"'
        print ' compressed: '+str(self.compressed)
        print ' compression method: '+str(self.compressionMethod)
        print ' language: "'+self.languageTag+'"'
        print ' tag translation: "'+self.languageTagTrans+'"'
        print ' text: "'+self.text+'"'
Since this uses zlib, it requires an import zlib at the top of your program.
The class constructor accepts 'too short' strings, in which case it will use defaults for everything undefined.
The show method lists the data for debugging purposes.
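The class above is Python 2 code. For Python 3, the same unpack and pack logic could be sketched as two plain functions working on bytes. The names parse_itxt and pack_itxt and the dictionary keys are my own invention, and unlike the class this sketch does not accept 'too short' input:

```python
import zlib

def parse_itxt(chunk_data):
    """Split raw iTXt chunk bytes into a dict of decoded fields."""
    keyword, _, rest = chunk_data.partition(b"\x00")
    compressed, method = rest[0], rest[1]            # one byte each
    language, _, rest = rest[2:].partition(b"\x00")
    translated, _, text = rest.partition(b"\x00")
    if compressed:
        if method != 0:
            raise ValueError("unknown compression method")
        text = zlib.decompress(text)
    return {
        "keyword": keyword.decode("latin-1"),
        "compressed": compressed,
        "language": language.decode("ascii"),
        "translated_keyword": translated.decode("utf-8"),
        "text": text.decode("utf-8"),
    }

def pack_itxt(fields):
    """Rebuild iTXt chunk bytes from a dict as returned by parse_itxt."""
    text = fields["text"].encode("utf-8")
    if fields["compressed"]:
        text = zlib.compress(text)
    flag = 1 if fields["compressed"] else 0
    return (fields["keyword"].encode("latin-1") + b"\x00"
            + bytes([flag, 0])                       # compression method is always 0
            + fields["language"].encode("ascii") + b"\x00"
            + fields["translated_keyword"].encode("utf-8") + b"\x00"
            + text)

raw = b"Custom\x00\x00\x00nl\x00Aangepast\x00Dat was leuk."
fields = parse_itxt(raw)
print(fields)

fields["compressed"] = 1                             # switch compression on
print(parse_itxt(pack_itxt(fields))["text"])
```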
With all of this, examining, modifying, and adding iTXt chunks finally is straightforward:
import png
import zlib

# insert helper and class here

sourceImage = png.Reader("itxt.png")
chunkList = []
for chunk in sourceImage.chunks():
    if chunk[0] == 'iTXt':
        itxt = Chunk_iTXt(chunk[1])
        itxt.show()
        # modify existing data
        if itxt.keyword == 'Author':
            itxt.text = 'Rad Lexus'
            itxt.compressed = 1
        chunk = [chunk[0], itxt.pack()]
    chunkList.append(chunk)

# append new data
newData = Chunk_iTXt('')
newData.keyword = 'Custom'
newData.languageTag = 'nl'
newData.languageTagTrans = 'Aangepast'
newData.text = 'Dat was leuk.'
# insert before the last chunk, which has to remain IEND
chunkList.insert(-1, ['iTXt', newData.pack()])

with open("foo.png", "wb") as outFile:
    png.write_chunks(outFile, chunkList)
When adding a totally new chunk, be careful not to append it, because then it will appear after the required last IEND chunk, which is an error. I did not try, but you should probably also not insert it before the required first IHDR chunk or (as commented by Glenn Randers-Pehrson) in between consecutive IDAT chunks.
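If you would rather not rely on IEND being the very last list element, a small helper can look the chunk up by name. This is only a sketch: insert_before_iend is a hypothetical name, and it accepts the chunk type as either str or bytes because I am assuming that which one you get depends on your Python and PyPNG versions.

```python
def insert_before_iend(chunks, new_chunk):
    """Return a copy of the chunk list with new_chunk placed just before IEND."""
    result = list(chunks)
    for index, (chunktype, _) in enumerate(result):
        if chunktype in (b"IEND", "IEND"):
            result.insert(index, new_chunk)
            return result
    raise ValueError("no IEND chunk found")

# Dummy chunk contents, just to show the resulting order.
chunks = [(b"IHDR", b""), (b"IDAT", b"a"), (b"IDAT", b"b"), (b"IEND", b"")]
updated = insert_before_iend(chunks, (b"iTXt", b"new data"))
print([chunktype for chunktype, _ in updated])
```

Placing the new chunk directly before IEND also keeps it after the last IDAT, so it cannot end up between consecutive IDAT chunks.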
Note that according to the specification, the text and the translated keyword in iTXt should be UTF-8 encoded (the keyword itself is Latin-1 and the language tag is ASCII).