
For example, let's say I have an array of shape (2,3,4,5), and I want to append a (2,3,4,1) array to it to produce an array of shape (2,3,4,6).

What is the most efficient way (for large dimensions)?

Is there something better than dimshuffle and vstack/hstack/dstack?

(Python 2.7)

capybaralet

2 Answers


Here are various ways of doing this, with accompanying benchmarks:

a = np.zeros([100,200,300,5])
b = np.zeros([100,200,300,1])

%timeit c=np.concatenate([a,b],-1)
#1 loops, best of 3: 241 ms per loop

%timeit c=np.vstack([a.T,b.T]).T
#1 loops, best of 3: 309 ms per loop

%timeit c=np.empty([100,200,300,5]); c[...,:5]=a; c[...,5:]=b
#1 loops, best of 3: 311 ms per loop

# Assuming c was already allocated:
%timeit c[...,:5]=a; c[...,5:]=b
#10 loops, best of 3: 113 ms per loop

These times are all quite comparable, and all quite slow. If all the arrays were in the transposed order, we could do a bit better:

va = np.zeros([5,300,200,100])
vb = np.zeros([1,300,200,100])

%timeit vc=np.concatenate([va,vb],0)
#1 loops, best of 3: 191 ms per loop

%timeit vc=np.vstack([va,vb])
#1 loops, best of 3: 284 ms per loop

%timeit vc=np.empty([6,300,200,100]); vc[:5]=va; vc[5:]=vb
#1 loops, best of 3: 281 ms per loop

#Assuming vc is already allocated. This case is somehow
#much faster than the others!
%timeit vc[:5]=va; vc[5:]=vb
#10 loops, best of 3: 26.4 ms per loop

#Somehow the time for allocating vc and for copying the
#values does not add up. I guess this has to do with
#caching working better when the same buffer is reused
%timeit vc=np.empty([6,300,200,100])
#100000 loops, best of 3: 7.73 µs per loop

Implementing the same operation in Fortran and calling it via f2py produced times of about 55 ms just for the assignment in the untransposed case, so it seems none of these options is horribly inefficient. I would recommend np.concatenate: it is general, and slightly faster than the equivalent *stack for some reason. That is, unless you can preallocate and reuse the output array, in which case slice assignment is faster by at least a factor of 2.
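Putting the recommendation together for the exact shapes in the question (a minimal sketch; the random example data is mine):

```python
import numpy as np

a = np.random.rand(2, 3, 4, 5)
b = np.random.rand(2, 3, 4, 1)

# General case: concatenate along the last axis.
c = np.concatenate([a, b], axis=-1)
assert c.shape == (2, 3, 4, 6)

# If the output buffer can be preallocated and reused across calls,
# plain slice assignment avoids repeated allocation.
c2 = np.empty((2, 3, 4, 6))
c2[..., :5] = a
c2[..., 5:] = b
assert np.array_equal(c, c2)
```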

amaurea

Use arr.resize((2,3,4,6)). This attempts to grow the existing memory used by the original array in place, so it is potentially faster than any method which is guaranteed to return a newly allocated array.

The downside is that it isn't always possible to do this in-place, in which case you have no choice but to create a new array, e.g. using numpy.append.

Further reading about resize and some caveats here.

shx2