
I have an application where I'll be repeating the following set of operations many times:

Operations:
-> Read N images (all have the same dimension (H,W))
-> Normalize each image to (0,1)
-> Store these N images in a single numpy array
-> Return the array (of shape (N, H, W))

Translating this into code, it would be something like:

import cv2
import numpy as np

def load_block(im_paths, H, W):
    N = len(im_paths)
    im_block = np.empty((N, H, W), dtype=np.float32)

    for i, im_path in enumerate(im_paths):
        # flag 0 -> read as grayscale, shape (H, W)
        image = cv2.imread(im_path, 0)
        # min-max normalize to (0, 1)
        im_block[i, :, :] = (image - image.min()) / (image.max() - image.min())
    return im_block

So I want to speed up this process. My initial go-to would be numba, but I'm not sure it will be of use here since I'm doing I/O operations.

    Seems relevant - [Fastest approach to read thousands of images into one big numpy array](https://stackoverflow.com/questions/44078327/fastest-approach-to-read-thousands-of-images-into-one-big-numpy-array). – Divakar Oct 27 '20 at 18:23
  • How big is `N`? Do you really need them all in one Numpy array? What sort of CPU and disk subsystem do you have? – Mark Setchell Oct 27 '20 at 18:45
  • @Divakar seems like the accepted answer is doing exactly what I'm doing thus far. Still thanks though, it is relevant indeed. – Mercury Oct 27 '20 at 21:30
  • @MarkSetchell N is variable, with an upper bound at 800, and I really do need them in a single array. I'm thinking of a system independent, general solution; may have to dig into multiprocessing and shared arrays. – Mercury Oct 27 '20 at 21:30

1 Answer


I don't think numba can help with the image loading, but it could possibly help with the normalization. Perhaps the following gains you something; it's certainly worth a try.

from numba import njit

images = [cv2.imread(im_path, 0) for im_path in im_paths]  # I/O stays in plain Python

@njit
def load_block(images, H, W):
    im_block = np.empty((len(images), H, W), dtype=np.float32)
    for i, image in enumerate(images):  # loop over images rather than image paths
        im_block[i, :, :] = (image - image.min()) / (image.max() - image.min())
    return im_block
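One caveat: passing a plain Python list into an @njit function goes through numba's reflected lists, which emit a deprecation warning on recent versions; wrapping the list in numba.typed.List avoids that.

For the reads themselves, which numba can't touch, a thread pool may be worth trying: cv2.imread spends most of its time in C++ code that releases the GIL, so threads can overlap the disk and decode work. A minimal sketch, not benchmarked — load_block_threaded and the default pool size are my own choices, not something from the question:

from concurrent.futures import ThreadPoolExecutor

import cv2
import numpy as np

def load_block_threaded(im_paths, H, W):
    N = len(im_paths)
    im_block = np.empty((N, H, W), dtype=np.float32)

    def load_one(i):
        # grayscale read; each thread writes its own slice, so no locking needed
        image = cv2.imread(im_paths[i], 0)
        im_block[i, :, :] = (image - image.min()) / (image.max() - image.min())

    with ThreadPoolExecutor() as pool:
        list(pool.map(load_one, range(N)))  # drain the iterator so exceptions surface
    return im_block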
Frank Yellin