What made me fall head-over-heels in love with Python was actually NumPy and specifically its amazing indexing and indexing routines!
In test_extra_crispy()
we can use zip()
to get our ducks (initial conditions) in a row, then indexing using offsets to do the "transplanting" of blocks of values:
i_values = np.arange(7)
istarts = (i_values * n2 / 2).astype(int)
for i, istart in zip(i_values, istarts):
tChunked[i, :n2] = t[istart:istart+n2]
See also
We can see that for
t = np.arange(10000000)
n1 = 7
"extra crispy" is a lot faster than the original (91 vs 4246 ms), but only a little faster than test2()
from ken's answer which is not significant considering that it does more careful checking than my brute force treatment.
However "extra crispy" avoids adding an additional axis and expanding the memory usage. For modest arrays this doesn't matter at all. But if you have a huge array then disk space and access time becomes a real problem.
"Just use Jit"?
Zaero Divide's answer points out that @jit goes a long way towards converting impractical python nested loops to fast compiled do loops.
But always check out what methods are already available in NumPy which are already running efficiently with compiled code. One can easily get a gigaflop on a laptop for problems amenable to NumPy's many handy methods.
In my case test_jit()
ran four times slower than "extra crispy" (350 vs 91 ms)!
You shouldn't write script that depends on @jit to run fast if you don't need to; not everyone wants to "just use @jit".
Strangely shaped indexing
If you need to address a more random-shaped volume within an array, you can use indexing like this:
array = np.array([[0, 0, 1, 0, 0], [0, 1, 0, 1, 0], [1, 0, 0, 0, 1], [0, 1, 0, 1, 0], [0, 0, 1, 0, 0]])
print(array)
gives
[[0 0 1 0 0]
[0 1 0 1 0]
[1 0 0 0 1]
[0 1 0 1 0]
[0 0 1 0 0]]
and we can get indices for the 1's like this:
i, j = np.where(array == 1)
print(i)
print(j)
which gives
If we want to start with a zeroed array and insert those 1's via numpy indexing, just do this
array = np.zeros((5, 5), dtype=int)
array[i, j] = 1
which recreates the original array.
import numpy as np
import matplotlib.pyplot as plt
import time
def test_original(n1, t):
n2 = int(2*t.size / (n1 + 1))
tChunked = np.zeros(shape = (n1, n2))
for i in range(n1):
istart = int(i * n2 / 2)
for j in range(0, n2):
tChunked[i, j] = t[istart + j]
return tChunked
t = np.arange(10000000)
n1 = 7
t_start = time.process_time()
tc_original = test_original(n1, t)
print('original process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_original.shape: ', tc_original.shape)
fig, ax = plt.subplots(1, 1)
for thing in tc_original:
ax.plot(thing)
plt.show()
def test_extra_crispy(n1, t):
n2 = int(2*t.size / (n1 + 1))
tChunked = np.zeros(shape = (n1, n2))
i_values = np.arange(7)
istarts = (i_values * n2 / 2).astype(int)
for i, istart in zip(i_values, istarts):
tChunked[i, :n2] = t[istart:istart+n2]
return tChunked
t_start = time.process_time()
tc_extra_crispy = test_extra_crispy(n1, t)
print('extra crispy process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_extra_crispy.shape: ', tc_extra_crispy.shape)
print('np.all(tc_extra_crispy == tc_original): ', np.all(tc_extra_crispy == tc_original))
import math
def test2(n1, t): # https://stackoverflow.com/a/72492815/3904031
n2 = int(2 * t.size / (n1 + 1))
istart = np.linspace(0, math.ceil(n1 * n2 / 2), num=n1, endpoint=False, dtype=np.int32)
jstart = np.linspace(0, n2, num=n2, endpoint=False, dtype=np.int32)
k = istart[:, np.newaxis] + jstart # Note: I switched i and j.
tChunked = t[k] # This creates an array of the same shape as k.
return tChunked
t_start = time.process_time()
tc_test2 = test2(n1, t)
print('test2 process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_test2.shape: ', tc_test2.shape)
print('np.all(tc_test2 == tc_original): ', np.all(tc_test2 == tc_original))
# from https://stackoverflow.com/a/72494777/3904031
from numba import njit
@njit
def test_jit(n1,t):
n2 = int(2*t.size/(n1+1))
print(n2)
tChunked = np.zeros(shape = (n1,n2))
for i in range(0,n1):
istart = int(i*n2/2)
for j in range(0,n2):
tChunked[i,j] = t[istart+j]
return tChunked
t_start = time.process_time()
tc_jit = test_jit(n1, t)
print('jit process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_jit.shape: ', tc_jit.shape)
print('np.all(tc_jit == tc_original): ', np.all(tc_jit == tc_original))