5

I have 16-bit PGM images that I am trying to read in Python. It seems that PIL does not support this format?

import Image
im = Image.open('test.pgm')
im.show()

This shows roughly the image, but it isn't right. There are dark bands throughout, and im is reported to have mode=L. I think this is related to an earlier question I had about 16-bit TIFF files. Is 16-bit so rare that PIL just does not support it? Any advice on how I can read 16-bit PGM files in Python, using PIL or another standard library, or home-grown code?

mankoff

3 Answers

4

You need a mode of "L;16"; however it looks like PIL has a mode of "L" hardcoded into File.c when loading a PGM. You’d have to write your own decoder if you want to be able to read a 16-bit PGM.

However, 16-bit image support still seems flaky:

>>> im = Image.fromstring('I;16', (16, 16), '\xCA\xFE' * 256, 'raw', 'I;16') 
>>> im.getcolors()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/dist-packages/PIL/Image.py", line 866, in getcolors
    return self.im.getcolors(maxcolors)
ValueError: image has wrong mode

I think PIL is capable of reading images with 16 bits, but actually storing and manipulating them is still experimental.

>>> im = Image.fromstring('L', (16, 16), '\xCA\xFE' * 256, 'raw', 'L;16') 
>>> im
<Image.Image image mode=L size=16x16 at 0x27B4440>
>>> im.getcolors()
[(256, 254)]

See, it kept only one byte (0xFE) of each 16-bit 0xCAFE sample and threw the other away, which isn’t exactly correct.
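For comparison, the same 512 bytes can be unpacked without losing bits using only the standard library. This is a minimal sketch, assuming Python 3; the helper name unpack_samples16 is made up for illustration and is not part of PIL.

```python
import struct

def unpack_samples16(data, big_endian=True):
    """Unpack raw bytes into a list of unsigned 16-bit samples (hypothetical helper)."""
    count = len(data) // 2
    fmt = ('>' if big_endian else '<') + str(count) + 'H'
    return list(struct.unpack(fmt, data[:count * 2]))

raw = b'\xCA\xFE' * 256                        # same bytes as in the PIL test
be = unpack_samples16(raw, big_endian=True)    # PGM order: most significant byte first
le = unpack_samples16(raw, big_endian=False)   # the same bytes read as little-endian

print(hex(be[0]))        # 0xcafe
print(hex(le[0]))        # 0xfeca
print(hex(le[0] >> 8))   # 0xfe, i.e. the 254 that getcolors() reported
```

Either way all 16 bits survive, so the choice of byte order is the only thing left to get right.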

Josh Lee
  • I'm happy to just read them. If I need to write I will use PNG. I am also OK with manipulating them as data in numpy rather than as an image in PIL. Your post has been helpful, but can you expand on how I can correctly read in the data? – mankoff Sep 09 '11 at 16:10
  • Do you mean writing a decoder for PIL, or how to interpret the PGM? – Josh Lee Sep 09 '11 at 16:14
  • Your italicized '''reading''' made me think there is some trick to perhaps make it work as-is? I'm trying to adapt the work-around here (http://stackoverflow.com/questions/7247371/python-and-16-bit-tiff) but without losing bits. If a custom decoder is needed I will write it based on the PIL tutorial. The PGM format seems pretty basic, so perhaps I should just read it directly into numpy... – mankoff Sep 09 '11 at 16:27
  • I’m not sure. In my test, it correctly parsed the file but was unable to represent its contents exactly. I don’t think using a custom decoder would have a different result than using fromstring. Numpy is probably a better bet. – Josh Lee Sep 09 '11 at 16:36
2

The following depends only on numpy to load the image, which can be an 8-bit or 16-bit raw PGM/PPM. I also show a couple of different ways to view the image. The one that uses PIL (import Image) requires that the data first be converted to 8-bit.

#!/usr/bin/python2 -u

from __future__ import print_function
import sys, numpy

def read_pnm_from_stream( fd ):
   pnm = type('pnm',(object,),{}) ## create an empty container
   pnm.header = fd.readline()
   pnm.magic = pnm.header.split()[0]
   pnm.maxsample = 1 if ( pnm.magic == 'P4' ) else 0
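   ntokens = 3 if ( pnm.maxsample == 1 ) else 4 ## P4 has no maxval token in its header
   while ( len(pnm.header.split()) < ntokens ):
      s = fd.readline()
      if ( len(s) == 0 ): break ## EOF: stop instead of looping forever
      if ( s[0] != '#' ): pnm.header += s ## skip comment lines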
   pnm.width, pnm.height = [int(item) for item in pnm.header.split()[1:3]]
   pnm.samples = 3 if ( pnm.magic == 'P6' ) else 1
   if ( pnm.maxsample == 0 ): pnm.maxsample = int(pnm.header.split()[3])
   pnm.pixels = numpy.fromfile( fd, count=pnm.width*pnm.height*pnm.samples, dtype='u1' if pnm.maxsample < 256 else '>u2' )
   pnm.pixels = pnm.pixels.reshape(pnm.height,pnm.width) if pnm.samples==1 else pnm.pixels.reshape(pnm.height,pnm.width,pnm.samples)
   return pnm

if __name__ == '__main__':

## read image
 # src = read_pnm_from_stream( open(filename) )
   src = read_pnm_from_stream( sys.stdin )
 # print("src.header="+src.header.strip(), file=sys.stderr )
 # print("src.pixels="+repr(src.pixels), file=sys.stderr )

## write image
   dst=src
   dst.pixels = numpy.array([ dst.maxsample-i for i in src.pixels ],dtype=dst.pixels.dtype) ## example image processing
 # print("dst shape: "+str(dst.pixels.shape), file=sys.stderr )
   sys.stdout.write(("P5" if dst.samples==1 else "P6")+"\n"+str(dst.width)+" "+str(dst.height)+"\n"+str(dst.maxsample)+"\n");
   dst.pixels.tofile( sys.stdout ) ## seems to work, I'm not sure how it decides about endianness

## view using Image
   import Image
   viewable = dst.pixels if dst.pixels.dtype == numpy.dtype('u1') else numpy.array([ x>>8 for x in dst.pixels],dtype='u1')
   Image.fromarray(viewable).show()

## view using scipy
   import scipy.misc
   scipy.misc.toimage(dst.pixels).show()

Usage notes

  • I eventually figured out "how it decides about endianness" -- it is actually storing the image in memory as big-endian (rather than native). This scheme might slow down any non-trivial image processing -- although other performance issues with Python may relegate this concern to a triviality (see below).

  • I asked a question related to the endianness concern here. I also ran into some interesting confusion related to endianness with this because I was testing by preprocessing the image with pnmdepth 65535 which is not good (by itself) for testing endianness since the low and high bytes might end up being the same (I didn't notice right away because print(array) outputs decimal). I should have also applied pnmgamma to save myself some confusion.

  • Because Python is so slow, numpy tries to be clever about how it applies certain operations (see broadcasting). The first rule of thumb for efficiency with numpy is to let numpy handle iteration for you (or, put another way, don't write your own for loops). The funny thing in the code above is that it only partially follows this rule when doing the "example image processing", and therefore the performance of that line depends heavily on the parameters that were given to reshape.

  • The next big numpy endianness mystery: why does newbyteorder() seem to return an array when it's documented to return a dtype? This is relevant if you want to convert to native endian with dst.pixels=dst.pixels.byteswap(True).newbyteorder().

  • Hints on porting to Python 3: binary input with an ASCII text header, read from stdin
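The notes above can be pulled together in a small Python 3 sketch. It is not a port of the script above, just one way it might look: read_pgm and _read_token are names made up here, only raw P5 files are handled, and the header is parsed byte by byte so that the stream can be a pipe or an in-memory buffer. The inversion is written as a single array expression so that numpy does the iteration, and astype with a native-order dtype is used as one alternative to the byteswap/newbyteorder combination.

```python
import io
import numpy

def _read_token(stream):
    """Return the next whitespace-delimited header token, skipping # comments."""
    token = b''
    while True:
        c = stream.read(1)
        if not c:
            raise ValueError('truncated PGM header')
        if c == b'#' and not token:
            stream.readline()            # a comment runs to the end of the line
        elif c.isspace():
            if token:
                return token             # consumes exactly one trailing whitespace byte
        else:
            token += c

def read_pgm(stream):
    """Read a raw (P5) PGM from a binary stream; returns (pixels, maxval)."""
    if _read_token(stream) != b'P5':
        raise ValueError('not a raw PGM file')
    width, height, maxval = (int(_read_token(stream)) for _ in range(3))
    # 16-bit samples are stored most significant byte first
    dtype = numpy.dtype('u1') if maxval < 256 else numpy.dtype('>u2')
    data = stream.read(width * height * dtype.itemsize)
    return numpy.frombuffer(data, dtype=dtype).reshape(height, width), maxval

# A tiny 3x2 16-bit image built in memory to exercise the reader.
sample = (b'P5\n# made-up test image\n3 2\n65535\n'
          + bytes.fromhex('00010100cafeffff00001234'))
pixels, maxval = read_pgm(io.BytesIO(sample))

inverted = maxval - pixels               # vectorized: no Python-level loop
native = pixels.astype(pixels.dtype.newbyteorder('='))   # same values, native byte order

print(pixels)
print(inverted)
print(native.dtype.isnative)
```

For a file on disk this would be called as read_pgm(open('test.pgm', 'rb')), and for piped input as read_pgm(sys.stdin.buffer) after importing sys.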

Brent Bradburn
  • Why does trying to write seemingly trivial Python programs seem to always result in an odyssey through Stack Overflow? – Brent Bradburn Dec 05 '15 at 21:53
  • One of the things that drives me crazy about Python is shallow copies, such as `dst=src` above. Sometimes I think that Python is just too difficult for a C++ programmer to understand. – Brent Bradburn Sep 10 '16 at 21:30
  • ...I found some of the lowest voted answers [here](http://stackoverflow.com/questions/9541025/how-to-copy-a-python-class) to be the most useful. In particular, it looks like I can solve my problem above by doing `dst=src()`. – Brent Bradburn Sep 10 '16 at 23:13
1

Here's a generic PNM/PAM reader based on NumPy and an undocumented function in PyPNG.

import numpy
import png ## PyPNG; read_pnm_header() is undocumented and may not exist in every version

def read_pnm( filename, endian='>' ):
   fd = open(filename,'rb')
   format, width, height, samples, maxval = png.read_pnm_header( fd )
   pixels = numpy.fromfile( fd, dtype='u1' if maxval < 256 else endian+'u2' )
   fd.close()
   return pixels.reshape(height,width,samples)

Of course writing this image format generally doesn't require the assistance of a library...
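As a rough illustration of that point, here is a minimal writer sketch for the grayscale case, assuming Python 3 and a 2-D numpy array of unsigned samples. The name write_pgm is made up for this example; only the P5 header layout and the big-endian sample order come from the format itself.

```python
import io
import numpy

def write_pgm(stream, pixels, maxval):
    """Write a 2-D array of unsigned samples as a raw (P5) PGM to a binary stream."""
    height, width = pixels.shape
    stream.write(b'P5\n%d %d\n%d\n' % (width, height, maxval))
    dtype = 'u1' if maxval < 256 else '>u2'    # 16-bit samples go out big-endian
    stream.write(pixels.astype(dtype).tobytes())

image = numpy.array([[0, 1, 256], [4660, 51966, 65535]], dtype='u2')
buf = io.BytesIO()
write_pgm(buf, image, 65535)
print(buf.getvalue())
```

Converting with astype before calling tobytes means the output does not depend on the byte order of the machine or of the array in memory.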

Brent Bradburn
  • I borrowed some ideas from [this related question](http://stackoverflow.com/questions/7368739/numpy-and-16-bit-pgm). – Brent Bradburn Feb 03 '13 at 01:35
  • With regards to 'PAM' support, the `read_pnm_header()` function used here doesn't return `TUPLTYPE`, but it does return the correct value for `DEPTH` (which I called `samples`). – Brent Bradburn Feb 03 '13 at 02:01
  • See [this question](http://stackoverflow.com/questions/2850893/reading-binary-data-from-stdin) for important notes on using stdio instead of a file. – Brent Bradburn Feb 05 '13 at 19:17
  • It's been a while, and I'm not sure where `endian` was supposed to have come from. I think it should just be replaced with `'>'` to indicate that the file is stored as big-endian (per the [standard](https://en.wikipedia.org/w/index.php?title=Netpbm_format&oldid=672069881#16-bit_extensions)). – Brent Bradburn Jul 19 '15 at 18:40