Data
Let's take the following 2D array:
import numpy as np

starts = [0, 4, 10, 13, 23, 27]
ends = [4, 10, 13, 23, 27, 32]
lengths = [4, 6, 3, 10, 4, 5]
arr = np.array([starts, ends, lengths]).T
It looks like this:
[[ 0  4  4]
 [ 4 10  6]
 [10 13  3]
 [13 23 10]
 [23 27  4]
 [27 32  5]]
Goal
Now I want to "loop" through the lengths and, as soon as the cumulative sum reaches 10, output the starts and ends and then restart the cumulative counting.
Working code
tot_size = 0
start = 0
for i, numb in enumerate(arr[:,-1]):
    # update sum
    tot_size += numb
    # Check if target size is reached
    if tot_size >= 10:
        start_loc, end_loc = arr[:,0][start], arr[:,1][i]
        print('Start: {}\nEnd: {}\nSum: {}\n'.format(start_loc, end_loc, tot_size))
        start = i + 1
        tot_size = 0
# Last part: only if rows are left over after the last completed group
if start < len(arr):
    start_loc, end_loc = arr[:,0][start], arr[:,1][i]
    print('Start: {}\nEnd: {}\nSum: {}\n'.format(start_loc, end_loc, tot_size))
Which will print:
Start: 0
End: 10
Sum: 10

Start: 10
End: 23
Sum: 13

Start: 23
End: 32
Sum: 9
(I don't need to know the resulting sum, but I do need to know the starts and ends.)
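Since only the starts and ends matter, the loop above could also be wrapped in a small helper that collects the pairs instead of printing them. This is just a sketch of the same logic; the name group_bounds and its target parameter are made up for illustration.

```python
import numpy as np

def group_bounds(arr, target=10):
    """Return (start, end) pairs; a group closes once its summed length reaches target."""
    bounds = []
    tot_size = 0
    first = 0  # row index where the current group begins
    for i, length in enumerate(arr[:, -1]):
        tot_size += length
        if tot_size >= target:
            bounds.append((int(arr[first, 0]), int(arr[i, 1])))
            first = i + 1
            tot_size = 0
    # Leftover rows that never reached the target form a final, shorter group
    if first < len(arr):
        bounds.append((int(arr[first, 0]), int(arr[-1, 1])))
    return bounds

starts = [0, 4, 10, 13, 23, 27]
ends = [4, 10, 13, 23, 27, 32]
lengths = [4, 6, 3, 10, 4, 5]
arr = np.array([starts, ends, lengths]).T
print(group_bounds(arr))  # [(0, 10), (10, 23), (23, 32)]
```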
Numpy try
I suppose there must be a much more straightforward, or a vectorized, way of doing this with numpy.
cumsum + remainder
I was thinking of something like np.remainder(np.cumsum(arr[:,-1]), 10). However, with this approach it is "hard" to tell when the running sum is close to the target number (10 here), which is different from just splitting when sum > x.
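To make the problem concrete, here is what the cumulative sum gives for the lengths in this question. The remainder alone does not flag the second group boundary, and counting how often the running total crosses a multiple of 10 (the crossings variable, which is only my own illustration) drifts away from the loop because the loop throws away the overshoot at every reset.

```python
import numpy as np

lengths = np.array([4, 6, 3, 10, 4, 5])
csum = np.cumsum(lengths)
rem = np.remainder(csum, 10)
print(csum)  # [ 4 10 13 23 27 32]
print(rem)   # [4 0 3 3 7 2]

# Rows where the running total crosses a multiple of 10
crossings = np.flatnonzero(np.diff(csum // 10, prepend=0))
print(crossings)  # [1 3 5]
```

The loop closes groups at rows 1 and 3 only. Row 5 shows up here because 32 passes 30, but after the reset at row 3 the loop has only counted 4 + 5 = 9.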
stride_tricks
Since the above doesn't work on a per-window basis, I thought of strides, but those windows have a fixed size.
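A small sketch of the mismatch, assuming NumPy 1.20 or newer for sliding_window_view. The lengths below are hypothetical (not my real data) and were picked so that the groups come out with different sizes, while every strided window necessarily has the same width.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

# Hypothetical lengths (not from the data above) whose groups differ in size
lengths = np.array([12, 3, 3, 5, 10])

# Group sizes produced by the reset-at-10 rule
sizes, count, tot = [], 0, 0
for n in lengths:
    tot += n
    count += 1
    if tot >= 10:
        sizes.append(count)
        count, tot = 0, 0
print(sizes)  # [1, 3, 1]

# A strided view offers a single width for all windows
windows = sliding_window_view(lengths, 3)
print(windows.shape)        # (3, 3)
print(windows.sum(axis=1))  # [18 11 18]
```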
All ideas are welcome :)