Suppose we have an input array containing some (but not all) NaN values, and we want to copy its non-NaN data into a NaN-initialized output array. After the write, the selected positions of the output array are still NaN, and I don't understand why:

# minimal example just for testing purposes

import numpy as np

# fix state of seed
np.random.seed(1000)
# create input array and nan-filled output array
a = np.random.rand(6,3,5)
b = np.full((6,3,5), np.nan)

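# index written as a tuple: newer NumPy versions no longer treat a list as a multi-dimensional index
x = (np.arange(6),1,2)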
# select data in one dimension with others fixed
y_temp = a[x]
# set arbitrary index to nan
y_temp[1] = np.nan
ind_valid = ~np.isnan(y_temp)
# select non-nan values
y = y_temp[ind_valid]

# write input to output at corresponding indices
b[x][ind_valid] = y
print(b[x])
# surprise, surprise :(
# [ nan  nan  nan  nan  nan  nan]

# workaround (that will of course cost computation time, even if not much)
c = np.full(len(y_temp), np.nan)
c[ind_valid] = y
b[x] = c
print(b[x])
# and this is what we want to have
# [ 0.39719446         nan  0.39820488  0.68190824  0.86534558  0.69910395]

I thought the array b would reserve a block of memory and that, by indexing with x, it would "know" those positions. It should then also know them when only some are selected with ind_valid, and be able to write to exactly those addresses in memory. I'm not sure, but maybe this is something similar to "python nested list unexpected behaviour"? Please explain, and if possible provide a cleaner solution than the workaround above. Thanks!

bproxauf
  • `x = [np.arange(6),1,2]` will create a fancy index, thus, a *copy* is made, not a slice-view. Note, Python list slices *always* copy, and `numpy.ndarray` slices create views instead. – juanpa.arrivillaga Feb 13 '18 at 10:34
  • Ok, I'm using this fancy index because, in general, I want to select one dimension from the N-D array while looping through the other dimensions, like `for coord in itertools.product(*index_grids): z_old = list(x_old + list(coord))`. Here x_old would be, for example, the np.arange(6) and coord would be (1,2). – bproxauf Feb 13 '18 at 10:36
  • The issue is that the number of dimensions can, in principle, be arbitrary. Any suggestions for the assignment of the data into the output? – bproxauf Feb 13 '18 at 10:39
  • I'm not sure I completely understand you, but unless you use slices, you cannot create views, and copies will have to be made. – juanpa.arrivillaga Feb 13 '18 at 10:43
  • But according to what you wrote b[x] should already be a copy not a view into b. So why does the second example work? – bproxauf Feb 13 '18 at 10:44
  • Because you do `b[x] = c`? – juanpa.arrivillaga Feb 13 '18 at 10:46
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/165035/discussion-between-bproxauf-and-juanpa-arrivillaga). – bproxauf Feb 13 '18 at 10:46
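
To illustrate the copy-versus-view point from the comments, here is a minimal sketch, not an accepted answer. It assumes a recent NumPy and writes the index as a tuple. It shows that `b[x]` read on its own returns a copy, so the chained `b[x][ind_valid] = y` only modifies a temporary, whereas a single indexed assignment on `b` writes in place. `copy_valid_along_axis0` is a hypothetical helper name for the arbitrary-dimension case, not a NumPy function.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1000)
a = rng.random((6, 3, 5))
a[1, 1, 2] = np.nan              # plant one nan in the input
b = np.full(a.shape, np.nan)     # nan-initialised output

x = (np.arange(6), 1, 2)         # tuple index containing an integer array
ind_valid = ~np.isnan(a[x])

# Reading b[x] with an integer-array index returns a fresh copy of the data.
tmp = b[x]
print(np.shares_memory(tmp, b))  # the copy does not share memory with b
tmp[ind_valid] = a[x][ind_valid] # modifies only the temporary copy
print(np.isnan(b).all())         # b itself is still all nan

# Fold the mask into the index so that one assignment targets b directly.
b[x[0][ind_valid], 1, 2] = a[x][ind_valid]
print(np.isnan(b[x]))            # only position 1 remains nan


def copy_valid_along_axis0(src, dst):
    """Hypothetical helper: copy non-nan values along axis 0 for every
    combination of the remaining indices, for any number of dimensions."""
    rows = np.arange(src.shape[0])
    for coord in itertools.product(*(range(n) for n in src.shape[1:])):
        y = src[(rows,) + coord]             # copy of one 1-D line of src
        valid = ~np.isnan(y)
        dst[(rows[valid],) + coord] = y[valid]  # single in-place assignment


out = np.full(a.shape, np.nan)
copy_valid_along_axis0(a, out)
```

Because basic slicing returns a view, `b[:, 1, 2][ind_valid] = y` would also write into `b` for this particular example; the helper above is only needed when the fixed indices vary in a loop.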