Suppose we have an input array containing some (but not all) NaN values, and we want to write its non-NaN entries into a NaN-initialized output array. After writing the non-NaN data into the output array, it still contains only NaN values, and I don't understand why:
# minimal example just for testing purposes
import numpy as np
# fix the random seed for reproducibility
np.random.seed(1000)
# create the input array and a NaN-filled output array
a = np.random.rand(6, 3, 5)
b = np.full((6, 3, 5), np.nan)
# index tuple: all of axis 0, fixed positions along axes 1 and 2
x = (np.arange(6), 1, 2)
# select data along one dimension, with the other two fixed
y_temp = a[x]
# set an arbitrary entry to NaN
y_temp[1] = np.nan
# boolean mask of the valid (non-NaN) entries
ind_valid = ~np.isnan(y_temp)
# select only the non-NaN values
y = y_temp[ind_valid]
# write the input to the output at the corresponding indices
b[x][ind_valid] = y
print(b[x])
# surprise, surprise :(
# [ nan nan nan nan nan nan]
# workaround (which of course costs some extra computation time, even if not much)
c = np.full(len(y_temp), np.nan)
c[ind_valid] = y
b[x] = c
print(b[x])
# and this is what we want to have
# [ 0.39719446 nan 0.39820488 0.68190824 0.86534558 0.69910395]
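While writing this up I also checked whether b[x] gives me a view or a copy (assuming np.shares_memory is the right tool for that check; this continues from the example above):

print(np.shares_memory(b, b[:, 1, 2]))  # True: basic slicing returns a view
print(np.shares_memory(b, b[x]))        # False: fancy indexing returns a copy

So b[x] apparently lives in its own block of memory, and b[x][ind_valid] = y seems to behave like tmp = b[x]; tmp[ind_valid] = y, i.e. it writes into a temporary that is then thrown away, but I would like to understand whether that is really what happens.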
I thought the array b would reserve some block of memory, and that by indexing it with x it would "know" those indices. Then it should also know them when selecting only some of them with ind_valid, and be able to write into exactly those memory addresses. No idea; maybe it's something similar to the well-known unexpected behaviour of nested Python lists? Please explain, and maybe also provide a nicer solution than the proposed workaround (my best attempt so far is sketched below). Thanks!
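The only direct alternative I have come up with is to fold the boolean mask into the integer index, so that the assignment happens on b itself in a single step (this seems to work, but I am not sure whether it is the intended idiom):

# axis-0 indices of the valid entries
rows = x[0][ind_valid]
# one-step fancy-index assignment directly on b
b[rows, 1, 2] = y_temp[ind_valid]
print(b[x])
# same output as with the workaround above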