Let's say I have a huge 2D array looking schematically like this:
import numpy as np

test = np.array([[0.1, 0.3, 0.5, 0.2, 5., np.nan, np.nan],
                 [2., 0.8, 0.1, 3., 2.5, 0.9, np.nan]])
Because it's huge, I want to merge adjacent entries along an axis so that the sum of each merged group reaches at least a certain value, say 1 in this case. The merged sums should be packed toward the lowest indices of each row, with the remaining positions filled with NaN:
np.array([[1.1, 5., np.nan, np.nan, np.nan, np.nan, np.nan],
          [2.9, 3., 3.4, np.nan, np.nan, np.nan, np.nan]])
I know this is possible by looping through one dimension of the array, collecting indices in a list, thresholding, merging, and then padding, but that seems rather complicated to me. I also tried np.apply_along_axis with something like this:
digi = np.digitize(test[0], np.arange(0, np.nanmax(test[0]), 0.05), right=True)
np.bincount(digi, weights=test[0])
following the answers here and here, but the result is only loosely related to what I want. Is there a simpler way to formulate this?
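To make the loop-based idea concrete, here is a minimal sketch of what I mean. The helper name `merge_row` and the exact grouping rule are assumptions, chosen only because they reproduce the example output above: a group keeps absorbing entries while its sum is still below the threshold or while the next entry is itself below the threshold, and NaNs are simply skipped. Other rules may fit the example equally well.

```python
import numpy as np

def merge_row(row, threshold=1.0):
    """Merge adjacent entries of a 1D array into groups whose sums reach `threshold`."""
    sums = []
    for value in row[~np.isnan(row)]:  # assumption: NaNs are padding and can be skipped
        # Assumed rule: extend the running group while it is still below the
        # threshold, or while the incoming value is itself below the threshold.
        if sums and (sums[-1] < threshold or value < threshold):
            sums[-1] += value
        else:
            sums.append(value)
    # Pack the group sums at the lowest indices and pad the rest with NaN.
    out = np.full(row.shape, np.nan)
    out[:len(sums)] = sums
    return out

test = np.array([[0.1, 0.3, 0.5, 0.2, 5., np.nan, np.nan],
                 [2., 0.8, 0.1, 3., 2.5, 0.9, np.nan]])
merged = np.apply_along_axis(merge_row, 1, test)
```

For the `test` array this matches the desired result up to floating-point rounding, but it runs a Python loop over every row, which will likely be slow on a huge array.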