I want to use Pillow for some simple handwritten image recognition. It will run in real time, so I need to call my function 5-10 times a second. I load the image but only access 1 in 20^2 pixels, so I really don't need the whole image. I need to reduce the image loading time.

I've never used a Python image library and would appreciate all suggestions.

from PIL import Image
import time

start = time.time()

im = Image.open('ir/IMG-1949.JPG')
width, height = im.size
px = im.load()

print("loading: ", time.time() - start)

desired loading time: <50ms, actual loading time: ~150ms

Fractal Salamander
  • Can you avoid JPEG - lossless image might be quicker to load? – DisappointedByUnaccountableMod Aug 26 '19 at 20:31
  • How many such images do you have? How many bytes are they on disk? What are their widths and heights? – Mark Setchell Aug 26 '19 at 21:05
  • I'm using my phone camera for it; the images are 4032 x 3024, 1.9MB. Admittedly the camera is overpowered for the task, but it's the only one I have. The default format of the images is JPEG, and I will probably not have more than 5 on the disk at any point in time, since they are of no interest after the code has run. – Fractal Salamander Aug 27 '19 at 06:06

1 Answer

Updated Answer

Since I wrote this answer, John Cupitt (author of pyvips) has come up with improvements, corrections, and fairer code and timings, and has kindly shared them here. Please look at his improved version, alongside or even in preference to my code below.

Original Answer

The JPEG library has a "shrink-on-load" feature which allows a lot of I/O and decompression to be avoided. You can take advantage of this with PIL/Pillow using the Image.draft() function, so instead of reading the full 4032x3024 pixels like this:

from PIL import Image

im = Image.open('image.jpg')
px = im.load() 

which takes 297ms on my Mac, you can do the following and read 1008x756 pixels, i.e. 1/4 the width and 1/4 the height:

im = Image.open('image.jpg') 
im.draft('RGB',(1008,756)) 
px = im.load()

and that takes only 75ms, i.e. it is 4x faster.
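
As a rough illustration of how a requested draft size maps to a shrink factor, here is a minimal sketch. It assumes that JPEG draft mode only offers reductions of 1/1, 1/2, 1/4 and 1/8, and that it picks the largest reduction that still leaves the image at least as big as the requested size. The helper name draft_size is hypothetical and is not part of Pillow.

```python
def draft_size(full_size, requested_size):
    """Estimate the shrink factor and size a JPEG draft-mode load would give.

    Assumption: only factors 1, 2, 4 and 8 are available, and the result
    is never smaller than the requested size.
    """
    full_w, full_h = full_size
    req_w, req_h = requested_size

    # Largest whole-number reduction that keeps both dimensions big enough
    ratio = min(full_w // req_w, full_h // req_h)

    # Snap down to the nearest factor assumed to be supported
    scale = 1
    for candidate in (8, 4, 2):
        if ratio >= candidate:
            scale = candidate
            break

    # Round the shrunk dimensions up so no partial pixels are dropped
    shrunk = (-(-full_w // scale), -(-full_h // scale))
    return scale, shrunk


if __name__ == "__main__":
    scale, shrunk = draft_size((4032, 3024), (1008, 756))
    print("shrink factor:", scale, "size:", shrunk)
```

For the 4032x3024 photos in the question, requesting 1008x756 gives a factor of 4, so sampling every 20th pixel of the original corresponds to sampling every 5th pixel of the shrunk image.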


Just for kicks, I tried comparing various techniques as follows:

#!/usr/bin/env python3

import numpy as np
import pyvips
import cv2
from PIL import Image

def usingPIL(f):
    # Full-size decode with Pillow
    im = Image.open(f)
    return np.asarray(im)

def usingOpenCV(f):
    # Full-size decode with OpenCV (channels come back in BGR order)
    arr = cv2.imread(f, cv2.IMREAD_UNCHANGED)
    return arr

def usingVIPS(f):
    # Full-size decode with pyvips, then wrap the raw bytes as a NumPy array
    image = pyvips.Image.new_from_file(f, access="sequential")
    mem_img = image.write_to_memory()
    imgnp = np.frombuffer(mem_img, dtype=np.uint8).reshape(image.height, image.width, image.bands)
    return imgnp

def usingPILandShrink(f):
    # Ask the JPEG decoder for a 1/4-scale image before loading
    im = Image.open(f)
    im.draft('RGB', (1008, 756))
    return np.asarray(im)

def usingVIPSandShrink(f):
    # Same idea with pyvips: shrink by 4 while decoding
    image = pyvips.Image.new_from_file(f, access="sequential", shrink=4)
    mem_img = image.write_to_memory()
    imgnp = np.frombuffer(mem_img, dtype=np.uint8).reshape(image.height, image.width, image.bands)
    return imgnp

And loaded that into ipython and tested like this:

%timeit usingPIL('image.jpg')
315 ms ± 8.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit usingOpenCV('image.jpg')
102 ms ± 1.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit usingVIPS('image.jpg')
69.1 ms ± 31.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit usingPILandShrink('image.jpg')
77.2 ms ± 994 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit usingVIPSandShrink('image.jpg')
42.9 ms ± 332 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

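In these timings pyvips is the clear winner: with shrink-on-load it takes about 43ms, and it is the only variant here that comes in under the 50ms target from the question.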
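
For anyone who wants to repeat this kind of comparison in a plain script rather than in IPython, here is a minimal sketch built on the standard-library timeit module. The names benchmark and fake_loader are hypothetical; fake_loader is only a stand-in workload so the example runs without any image file, and you would pass one of the loader functions above with a real path instead.

```python
import statistics
import timeit


def benchmark(func, *args, repeat=7, number=10):
    """Time func(*args) and return (mean, stdev) in seconds per call.

    This loosely mirrors what %timeit reports: `repeat` runs of
    `number` calls each.
    """
    totals = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


def fake_loader(n):
    # Stand-in workload (assumption): replace with a real image loader
    return sum(range(n))


if __name__ == "__main__":
    mean, stdev = benchmark(fake_loader, 100_000)
    print(f"{mean * 1000:.3f} ms ± {stdev * 1000:.3f} ms per call")
```

Timing each loader this way on your own machine and images matters more than the absolute numbers above, since the results depend on which JPEG library each package was built against.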

Keywords: Python, PIL, Pillow, image, image processing, JPEG, shrink-on-load, shrink on load, draft mode, read performance, speedup.

Mark Setchell
  • @jcupitt You are absolutely correct - my OpenCV uses `libjpeg-turbo`. I used this handy answer to check https://stackoverflow.com/a/59281440/2836621 Not sure how to disable it for OpenCV, or how to enable it for PIL and pyvips to make it fairer? – Mark Setchell Apr 24 '20 at 11:01
  • Those vips results look suspect; 300 times faster than PIL on a large image? In trying out vips on a project of mine, it's actually several times slower than PIL... – marcelm Sep 24 '22 at 19:26