2

I'm trying to find a way to perform operations on each elements across multiple 2D-arrays without having to loop over them. Or at least, not needing two for loops. My code calculates the standard deviation of each pixel over a series of images (arrays). Now, the amount of images there are is not the problem, it is the size of the arrays, making the code take extremely slow. The following is a working example of what I have.

import numpy as np

# reshape(# of image (arrays),# of rows, # of cols) 
a = np.arange(32).reshape(2,4,4)

stddev_arr = np.array([])
for i in range(4):
    for j in range(4): 
        pixel = a[0:,i,j]
        stddev = np.std(pixel) 
        stddev_arr = np.append(stddev_arr, stddev)

My actual data is 2000x2000, making this code loop 4000000 times. Is there a better way to do this? Any advice is extremely appreciated.

martineau
  • 119,623
  • 25
  • 170
  • 301
Mints
  • 103
  • 8

2 Answers2

4

You're already using numpy. numpy's std() function takes an axis argument that tells it what axis you want it to operate on (in this case the zeroth axis). Because this offloads the calculation to numpy's C-backend (and possibly using SIMD optimizations for your processor that vectorize a lot of operations), it's so much faster than iterating. Another time-consuming operation in your code is when you append to stddev_arr. Appending to numpy arrays is slow because the entire array is copied into new memory before the new element is added. Now you already know how big that array needs to be, so you might as well preallocate it.

a = np.arange(32).reshape(2, 4, 4)
stdev = np.std(a, axis=0)

This gives a 4x4 array

array([[8., 8., 8., 8.],
       [8., 8., 8., 8.],
       [8., 8., 8., 8.],
       [8., 8., 8., 8.]])

To flatten this into a 1D array, do flat_stdev = stdev.flatten().

Comparing the execution times:

# Using only numpy
def fun1(arr):
    return np.std(arr, axis=0).flatten()

# Your function
def fun2(arr):
    stddev_arr = np.array([])
    for i in range(arr.shape[1]):
        for j in range(arr.shape[2]): 
            pixel = arr[0:,i,j]
            stddev = np.std(pixel) 
            stddev_arr = np.append(stddev_arr, stddev)
    return stddev_arr


# Your function, but pre-allocating stddev_arr
def fun3(arr):
    stddev_arr = np.zeros((arr.shape[1] * arr.shape[2],))
    x = 0
    for i in range(arr.shape[1]):
        for j in range(arr.shape[2]): 
            pixel = arr[0:,i,j]
            stddev = np.std(pixel) 
            stddev_arr[x] = stddev
            x += 1
    return stddev_arr

First, let's make sure all these functions are equivalent:

a = np.random.random((3, 10, 10))
assert np.all(fun1(a) == fun2(a))
assert np.all(fun1(a) == fun3(a))

Yup, all give the same result. Now, let's try with a bigger array.

a = np.random.random((3, 100, 100))

x = timeit.timeit('fun1(a)', setup='from __main__ import fun1, a', number=10)
# x: 0.003302899989648722

y = timeit.timeit('fun2(a)', setup='from __main__ import fun2, a', number=10)
# y: 5.495519500007504

z = timeit.timeit('fun3(a)', setup='from __main__ import fun3, a', number=10)
# z: 3.6250679999939166

Wow! We get a ~1.5x speedup just by preallocating. Even more wow: using numpy's std() with the axis argument gives a > 1000x speedup, and this is just for the 100x100 array! With bigger arrays, you can expect to see even bigger speedup.

Pranav Hosangadi
  • 23,755
  • 7
  • 44
  • 70
  • This has been really helpful, thank you very much! By using pre-allocating I have a ~12x speedup! – Mints Nov 09 '20 at 19:16
1

So based on what you have provided, you can reshape your array in another way to vectorize it to replace your two loops. Then you only have to use np.std once on the axis that you want.

a = np.arange(32).reshape(2, 4, 4)

a = a.reshape(2, -1).transpose()

stddev_arr = np.std(a, axis=1)
Luke B
  • 1,143
  • 1
  • 11
  • 22
  • 1
    Why does it need to be reshaped or transposed? `np.std(np.arange(32).reshape(2,4,4), axis=0)` works just fine. – Pranav Hosangadi Nov 05 '20 at 19:19
  • Yes, but your solution gives a 2d array, which have to be reshaped anyway. It's down to preference and I prefer to do it before the operation. Your solution works as well. – Luke B Nov 05 '20 at 19:20
  • 2
    Can you start this solution from `a = np.arange(32).reshape(2,4,4)` which was OP's sample dataset. Presumably that matches the real array being worked with. – tdelaney Nov 05 '20 at 19:26