You could use stride.as_strided:
import numpy.lib.stride_tricks as stride
s = u.strides[0]
H2 = stride.as_strided(u, shape=(N-q+1,q), strides=(s, s)).astype(complex)
Using strides=(s, s) is the key. In particular, making the first stride s means that each row of H2 advances the index into u by the number of bytes needed to advance one item. Hence the rows repeat, albeit shifted by one.
For example,
import numpy as np
import numpy.lib.stride_tricks as stride

N, q = 10**2, 6
u = np.arange((N-q+1)*(N))

def using_loop(u):
    # Copy each length-q window of u into its own row.
    H = np.zeros(shape=(N-q+1,q),dtype=complex)
    for i in range(0,N-q+1):
        H[i,:] = u[i:q+i]
    return H

def using_stride(u):
    # Build all windows at once as a strided view, then copy to complex.
    s = u.strides[0]
    H2 = stride.as_strided(u, shape=(N-q+1,q), strides=(s, s)).astype(complex)
    return H2

H = using_loop(u)
H2 = using_stride(u)
assert np.allclose(H, H2)
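To see the shifted rows directly, here is a minimal sketch on a small array. The names small, q_small, and windows are illustrative only and are not part of the code above.

```python
import numpy as np
import numpy.lib.stride_tricks as stride

small = np.arange(6)          # [0 1 2 3 4 5]
q_small = 3                   # window length (illustrative value)
s = small.strides[0]          # bytes needed to advance one item

# Both strides equal s, so moving down one row or right one column
# each advances exactly one item in the underlying buffer.
windows = stride.as_strided(
    small, shape=(small.size - q_small + 1, q_small), strides=(s, s)
)
print(windows)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]
#  [3 4 5]]
```

Note that as_strided returns a view that shares memory with its input and does no bounds checking, so the requested shape must fit inside the buffer. In using_stride the .astype(complex) call makes a copy, so later writes to H2 do not touch u.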
Since stride.as_strided avoids the Python for-loop, using_stride is faster than using_loop. The advantage grows as N-q (the number of iterations) increases.
With N = 10**2, using_stride is about 5x faster:
In [119]: %timeit using_loop(u)
10000 loops, best of 3: 61.6 µs per loop
In [120]: %timeit using_stride(u)
100000 loops, best of 3: 11.9 µs per loop
With N = 10**3, using_stride is about 28x faster:
In [122]: %timeit using_loop(u)
1000 loops, best of 3: 636 µs per loop
In [123]: %timeit using_stride(u)
10000 loops, best of 3: 22.4 µs per loop
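If a recent NumPy is available, the same windows can probably be built without computing strides by hand. The sketch below assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view was added; the name H3 is illustrative, and this variant was not part of the timings above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

N, q = 10**2, 6
u = np.arange((N - q + 1) * N)

# sliding_window_view returns every length-q window of u as a read-only view.
# Keep only the first N-q+1 rows to match the shape used above, then copy
# to complex.
H3 = sliding_window_view(u, q)[: N - q + 1].astype(complex)
```

Unlike a raw as_strided call, sliding_window_view derives the shape and strides from the input, so it cannot read past the end of the buffer.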