
How can assignment of a list of elements be sped up? For scalar data, the answer would be to pre-allocate an array:

import numpy as np
maxTime=1000; A=np.zeros(maxTime)
for t in range(maxTime): 
 data=get_fancy_data() 
 A[t]=data

However, if your fancy data is a list whose size differs every timestep, how can this be done efficiently?

python -m timeit -s "import numpy as np; N=10**4; r=np.random.random(N); A=np.zeros(N);" \
"for i in range(N): A[:i+1] = r[:i+1]"
# 100 loops, best of 3: 14.9 msec per loop

python -m timeit -s "import numpy as np; N=10**4; r=np.random.random(N); A=np.zeros(N);" \
"for i in range(N): A = r[:i+1]"
# 1000 loops, best of 3: 1.68 msec per loop

Leaving out the pre-allocation A=np.zeros(N) in the second example doesn't significantly alter the time taken.

I am not entirely sure why the second example is faster. I suspect that A[:i+1] creates a copy of that part of A before assignment, à la Python list slicing efficiency.
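
One way to probe this (only a quick sanity check, not a benchmark) is to ask numpy whether the arrays share memory after each kind of assignment; np.shares_memory is available in recent numpy versions:

import numpy as np

r = np.random.random(10)
A = np.zeros(10)

A[:5] = r[:5]                    # slice assignment: values are copied into A's own buffer
print(np.shares_memory(A, r))    # False: A still owns its original memory

A = r[:5]                        # plain assignment: the name A is rebound to a view of r
print(np.shares_memory(A, r))    # True: no data was copied
print(A.base is r)               # True: basic slicing returns a view, not a copy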

I have some code with a bottleneck in such an operation, but could not find a faster approach to this.


I note that this is related to another question because it concerns the meaning of slice assignment such as A[0:2] = ...: it modifies the original array A rather than creating a new array and discarding the old one. However, this question asks for a way of modifying an array A that is faster than repeatedly creating new arrays.

Lethay
  • You have no lists here. Lists and NumPy arrays may be superficially similar, but they have very different operation semantics and performance characteristics. – user2357112 Jun 01 '17 at 17:12
  • Well, for starters, the two examples are not doing the same thing. The first modifies the original *array*, the second never modifies the original array, instead, it reassigns slices of the original `A` to the name `A`, destining the original array for garbage collection. – juanpa.arrivillaga Jun 01 '17 at 17:16
  • Possible duplicate of [How assignment works with python list slice](https://stackoverflow.com/questions/10623302/how-assignment-works-with-python-list-slice) – pvg Jun 01 '17 at 17:29
  • @user2357112 in this case, i don't think they do, in any meaningful way. In both cases, there is slicing as a view and slice assignment. – pvg Jun 01 '17 at 17:31
  • You can verify what @juanpa.arrivillaga is telling you by initting as `A=B=np.zeros(N)` and then looking at the contents of B after the loop completes. In the second case, you're not changing anything, just reassigning a reference over and over. Unsurprisingly, this is pretty quick, compared to moving data – pvg Jun 01 '17 at 17:35
  • @pvg: List slicing doesn't make a view. The (1-dimensional) slice assignment semantics are similar, but slice retrieval is entirely different. – user2357112 Jun 01 '17 at 17:36
  • @user2357112 'view' was a dumb choice of word. But the operational differences don't really matter here. The difference is that one of these does a lot more nothing than the other, just like it would if you were using lists. The list case would do slightly less nothing. – pvg Jun 01 '17 at 17:45
  • @juanpa.arrivillaga, that's exactly my issue. My original code would repeatedly let `A=something` and thus repeatedly throw away the old array. I reasoned that doing this would mean I'm repeatedly allocating a little bit more memory, which would have the same issue as repeatedly doing `A=np.append(A,0)`, or `vector x=0; x.push_back(0)`. I expected that preallocation and assigning to slices of `A` would be faster. It was not. Thus, I wondered if there is a way to preallocate a large amount of memory in this way with better performance than reassigning A constantly. – Lethay Jun 02 '17 at 09:00
  • @user2357112 Thank you, my apologies. Using the word `list` was a slip of a mind. Indeed, these are arrays. @pvg I think that I disagree that it's a duplicate. I know that `A[x:y]=B[:y-x]` modifies `A`, but it is surprisingly low performance and I was seeking a method that is higher performance than `A=B[y-x]` - if possible! I would also disagree (partially) that `A=r[:i+1]` only moves a reference. If I subsequently modify `A` after this, `r` remains unchanged as long as `i+1!=len(r)`. Thank you all for your comments thus far. They've been enlightening. – Lethay Jun 02 '17 at 09:08
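
Following the check suggested in the comments above, a small sketch (with N shrunk so it runs instantly) confirms that only the first form ever writes into the pre-allocated buffer:

import numpy as np

N = 1000
r = np.random.random(N)

A = B = np.zeros(N)              # B is a second name for the same pre-allocated array
for i in range(N): A[:i+1] = r[:i+1]
print(np.array_equal(B, r))      # True: the data really was copied into the buffer

A = B = np.zeros(N)
for i in range(N): A = r[:i+1]
print(B.any())                   # False: the buffer was never written to
print(A.base is r)               # True: A ended up as a mere view of r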

1 Answer


I couldn't solve the mystery of why my pure-Python attempt at preallocation was not faster, so I decided to go lower-level and use Cython, which spared me from working out how to pass numpy arrays to pure C code without copying them.

My solution is something like this:

modify_array.pyx

import numpy as np
cimport numpy as np
cimport cython

ctypedef np.float64_t DTYPE_t

@cython.boundscheck(False) # turn off bounds-checking for entire function
@cython.wraparound(False)  # turn off negative index wrapping for entire function
cpdef foo(unsigned int T, unsigned int S, np.ndarray[DTYPE_t,ndim=2] data, np.ndarray[DTYPE_t,ndim=2] A):
 cdef size_t t,s;
 for t in range(T):
  for s in range(S): A[t,s]=data[t,S-1-s];  # explicit C-level loops: fill A from data, each row reversed
 return 0;

The array seems to be passed by reference, unless I am mistaken (a concept that does not exist in pure Python); in any case, any changes made to A inside foo are reflected in the calling Python code. It's also possible to do this with a plain def function taking numpy arrays:

modify_array_np.pyx

import numpy as np
cimport numpy as np
ctypedef np.float64_t DTYPE_t

def baz(unsigned int T, unsigned int S, np.ndarray[DTYPE_t,ndim=2] data, np.ndarray[DTYPE_t,ndim=2] A):
 cdef size_t t,s;
 for t in range(T):
  for s in range(S): A[t,s]=data[t,S-1-s];
  #n.b. Do not be tempted to try a more pythonic approach like A=data[:,::-1]:
  #it will require calls to the python interpreter and gut your speed boost.
 return 0;

You should avoid using any numpy or python functions - that is, you should explicitly loop over indices instead of calling, say, np.sum - so that the algorithm can be properly converted to C code. modify_array_np.pyx is a factor of two faster than the equivalent solution I presented in the question, and modify_array.pyx is a few tens of percent faster again. It isn't as much as I hoped for, but it's better. If anyone has any faster suggestions, I'd be glad to hear them.
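
One way to check that the inner loops really have been lowered to plain C (a quick sanity check rather than something I profiled in detail) is Cython's annotation mode, which highlights any lines that still call back into the Python interpreter:

cython -a modify_array.pyx
# inspect the generated modify_array.html: highlighted lines still go through the Python C-API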

Using the function in Python

For reference, compile the above Cython code from the command line with,

cython modify_array.pyx

which produces a C file that you should then compile using your C compiler (see here for required includes/flags; a setup.py sketch is given at the end of this answer). Then, in a Python file:

import modify_array
import numpy as np

T=100; S=100
A=np.zeros((T,S)); data=np.zeros((T,S))

for t in range(T):
 data[t]=get_fancy_data()  # placeholder for the per-timestep computation

returnValue=modify_array.foo(T,S,data,A) # the function modifies A directly, in place
#Do more stuff with A
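
As an aside, a minimal setup.py along the following lines (just a sketch, using the two .pyx filenames above) saves working out the compiler flags and the numpy include path by hand; run it with python setup.py build_ext --inplace:

# setup.py
from setuptools import setup, Extension
from Cython.Build import cythonize
import numpy as np

extensions = [
    Extension("modify_array", ["modify_array.pyx"], include_dirs=[np.get_include()]),
    Extension("modify_array_np", ["modify_array_np.pyx"], include_dirs=[np.get_include()]),
]
setup(ext_modules=cythonize(extensions))
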
Lethay