
How can I efficiently create a numpy tensor that collects many matrices and stacks them, always adding new matrices? This is useful for managing batches of images, for example, where each image is a 3D matrix (one plane for each of the RGB channels). Stacking N such 3D images together creates a 4D matrix.

Here is the base form of simply appending two matrices along a new dimension, creating a final matrix of higher dimension than the original two. And here is some information on the np.newaxis functionality.
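
For concreteness, here is a tiny sketch (the array names are made up) of what np.newaxis does in this context:

import numpy as np

img_a = np.ones(shape=(3, 50, 50))     # a dummy RGB image: (channels, height, width)
img_b = np.zeros(shape=(3, 50, 50))

# np.newaxis adds a leading batch dimension: (3, 50, 50) -> (1, 3, 50, 50)
batch = np.concatenate((img_a[np.newaxis, ...], img_b[np.newaxis, ...]), axis=0)
print(batch.shape)    # (2, 3, 50, 50)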

n1k31t4
  • Why are people voting down and not giving any advice? – n1k31t4 Nov 02 '17 at 02:12
  • They probably have something against self-answered questions. Self-answered questions are [explicitly okay and officially encouraged](https://stackoverflow.com/help/self-answer), guys. – user2357112 Nov 02 '17 at 02:20
  • @user2357112 - Thanks. I actually post this stuff because I couldn't find a specific answer to my question and so want to save other people the time it cost me. Also, I hope someone maybe thinks of a better way! :) – n1k31t4 Nov 02 '17 at 02:22
  • You can also use `np.fromiter`; I've benchmarked it in an answer here: https://stackoverflow.com/questions/46860970/why-use-numpy-over-list-based-on-speed/46868693#46868693. It compares well to the list-based approach. – user2699 Nov 02 '17 at 13:59

1 Answer


There are two good ways I know of, both of which use techniques from the links in the question as building blocks.

If you create an array with, e.g., 3 dimensions and would like to keep appending such results to one another on the fly, building a 4D tensor, then the very first array needs a small amount of setup before the answers linked in the question can be applied.

One approach is to store all the single matrices in a list (appending as you go) and then combine them with a single call to np.array:

import numpy as np

def list_version(N):
    # collect each 3D matrix in a plain Python list
    outputs = []
    for i in range(N):
        one_matrix = function_creating_single_matrix()
        outputs.append(one_matrix)
    # a single call stacks the whole list into a 4D array
    return np.array(outputs)
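
As a side note (not part of the original approach), np.stack does the same job more explicitly: it joins a list of equal-shaped arrays along a new leading axis, so it is interchangeable with np.array here:

outputs = [np.ones(shape=(10, 50, 50)) for _ in range(5)]
stacked = np.stack(outputs, axis=0)    # shape (5, 10, 50, 50), same as np.array(outputs)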

A second approach extends the very first matrix (e.g. one that is 3D) to become 4D using np.newaxis. Each subsequent 3D matrix is then extended in the same way and np.concatenated onto the result one by one; the final array grows along this new dimension. Here is an example:

def concat_version(N):
    for i in range(N):
        if i == 0:
            results = function_creating_single_matrix()
            results = results[np.newaxis, ...]    # add the new leading dimension
        else:
            output = function_creating_single_matrix()
            # each call allocates a fresh array and copies everything into it
            results = np.concatenate((results, output[np.newaxis, ...]), axis=0)
            # the result grows along the first dimension (axis=0)
    return results

I also compared the two variants for performance using Jupyter %%timeit cells. To make the pseudocode above runnable, I simply created matrices filled with ones to be appended to one another:

def function_creating_single_matrix():
    return np.ones(shape=(10, 50, 50))

This also lets me compare the outputs to verify that the two functions return the same result.

%%timeit -n 100
resized = list_version(N=100)
# 4.97 ms ± 25.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit -n 100
resized = concat_version(N=100)
# 96.6 ms ± 144 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So the list method seems to be ~20 times faster, at least for matrices of this size!

Here we see that the functions return identical results:

list_output = list_version(N=100)
concat_output = concat_version(N=100)
np.array_equal(list_output, concat_output)
# True

I also ran cProfile on the functions; the reason for the gap seems to be that np.concatenate spends most of its time copying the growing result on every call. I then traced this behaviour back to the underlying C code.
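
For anyone who wants to reproduce that profiling, a minimal sketch (assuming the functions above are already defined in the current session):

import cProfile

# sort by cumulative time to see which calls dominate each variant
cProfile.run('list_version(N=100)', sort='cumtime')
cProfile.run('concat_version(N=100)', sort='cumtime')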

Links:

  • Here is a similar question that stacks several arrays, but it assumes they all already exist, rather than being generated and appended on the fly.
  • Here is some more discussion on the memory management and speed of the above-mentioned methods.
n1k31t4
  • Calling `concatenate` in a loop is terrible for performance because every call has to allocate an entire new array and copy the arguments into the new array. Gathering them into a list and building the final array in one call is the way to go. – user2357112 Nov 02 '17 at 02:22
  • @n1k31t4 Thanks for sharing. Unfortunately, the competitive culture on this site (or the brain damage due to programming in excess) is to destroy the efforts of others. – dawid Feb 14 '21 at 22:12