I thought this would be a simple question and looked for related topics, but I couldn't find the right answer. Here is the problem:

I have two NumPy arrays on which I need to run a statistical analysis by calculating some criteria, for example the correlation coefficient and the Nash criterion (for those who are familiar with Nash). Since the first array holds observation data (the second holds simulation results), it contains some NaNs. I would like my program to calculate the criteria while ignoring the value pairs where the value in the first array is NaN. I tried the mask method. It worked well when I only needed to deal with the first array (to calculate its average, for example), but it didn't work for comparing the two arrays value by value.

Could anyone give some help? Thanks!

  • Can you add some code and show us where exactly the problem is? – barak manos Feb 17 '14 at 09:31
  • Possibly related: [comparing numpy arrays containing NaN](http://stackoverflow.com/questions/10710328/comparing-numpy-arrays-containing-nan) – Ashwini Chaudhary Feb 17 '14 at 09:32
  • @barak Well, I have two arrays, obs and sim, that have the same length and whose values match one to one (for every time step I have an observed value and a simulated value). In obs I have some NaNs for the time steps where I have no observation data. Now I have to compute the correlation coefficient of the two arrays, which means I need to calculate, among other things, (obs[i]-sim[i])² for every time step i. For obs[i]=NaN, this expression gives NaN. So I have to ignore both obs[i] and sim[i] wherever obs[i]=NaN. But I have no idea how to do this, so I cannot show any code... – user3306110 Feb 17 '14 at 12:11
  • @Ashwini I found that discussion before posting my question. Yes, the problem is similar. However, the answer to the question you linked cannot solve my problem, since I don't see how checking whether the two arrays are equal (even if that check takes the NaN case into account) would let me tell my code to ignore values in one of the two arrays. – user3306110 Feb 17 '14 at 12:16
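
The requirement described in these comments (skip both obs[i] and sim[i] wherever obs[i] is NaN) can be sketched with a single boolean mask applied to both arrays. This is only an illustration, not code from the thread: the sample values are made up, and the names `valid`, `obs_valid`, and `sim_valid` are hypothetical.

```python
import numpy as np

# Made-up sample data; NaN marks time steps without an observation.
obs = np.array([1.0, np.nan, 3.0, 4.0, np.nan, 6.0])
sim = np.array([1.5, 2.0, 2.5, 4.5, 5.0, 6.5])

# True where an observation exists, False where obs is NaN.
valid = ~np.isnan(obs)

# Indexing both arrays with the same mask keeps the pairs aligned.
obs_valid = obs[valid]
sim_valid = sim[valid]

# Element-wise squared differences, now free of NaN.
squared_diff = (obs_valid - sim_valid) ** 2

# Pearson correlation coefficient of the remaining pairs.
r = np.corrcoef(obs_valid, sim_valid)[0, 1]
```

Because the mask is built once from `obs` and reused for `sim`, no simulated value is ever paired with the wrong observation.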

1 Answer

I just answered a similar question, "Numpy only on finite entries". You can locate the NaN values in your array with NumPy's isnan function and then replace them, which is a common way to deal with NaN values.

    import numpy as np

    # Boolean mask: True wherever array_name holds NaN
    replace_NaN = np.isnan(array_name)
    # Overwrite those entries in place with 0
    array_name[replace_NaN] = 0

user1749431
  • `np.nan_to_num(array_name)` does the same as your code. – zhangxaochen Feb 17 '14 at 10:13
  • Hello, if I replace the NaNs with 0, I certainly won't have NaNs anymore. But my calculations will take these zeros into account. In fact, the NaNs I had initially mark the time steps where I had no observation data, so I must ignore those steps in the analysis. In this case 0 is a value, but a wrong one. Unless you mean I have to create a condition for the program to ignore all the zeros? – user3306110 Feb 17 '14 at 11:58
  • One solution would be to use `if i != 0:` and add the other values to a new array. – user1749431 Feb 17 '14 at 12:03
  • @user1749431 This could be an option. But how can I select the right values in the second array, the ones that correspond to the zeros in the first array? – user3306110 Feb 17 '14 at 12:22
  • If you want an intersection between the two arrays you can loop, `for i in data:`, and get i from the first array and i from the second array. But I'm not sure I follow correctly: you have some data with 0 occurrences in some columns of your array; if you append the other values to a new array, the memory of where in the data those values came from is already automatically stored. – user1749431 Feb 17 '14 at 12:50
  • @user1749431 Hello, I didn't quite understand your last comment. But in the end I solved the problem with a selection by index. Thank you anyway. – user3306110 Feb 21 '14 at 10:28
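
The "selection by index" mentioned in the last comment was never posted, so the following is only a guess at what it could look like. It assumes that "Nash criterion" means the Nash-Sutcliffe efficiency, 1 - Σ(obs - sim)² / Σ(obs - mean(obs))²; the helper `nash_sutcliffe` and the sample values are hypothetical. It also computes the same criterion on zero-filled data, to illustrate the objection raised above that 0 is "a value, but a wrong one".

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency over the time steps where obs is not NaN.

    Hypothetical helper; assumes obs and sim have the same length.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    # Integer indices of the time steps that have an observation.
    idx = np.flatnonzero(~np.isnan(obs))
    o, s = obs[idx], sim[idx]
    return 1.0 - np.sum((o - s) ** 2) / np.sum((o - o.mean()) ** 2)

# Made-up sample data; NaN marks time steps without an observation.
obs = np.array([1.0, np.nan, 3.0, 4.0, np.nan, 6.0])
sim = np.array([1.5, 2.0, 2.5, 4.5, 5.0, 6.5])

# Criterion on the selected pairs only.
nse_selected = nash_sutcliffe(obs, sim)

# Same criterion after replacing NaN with 0: the zeros count as observations.
nse_zero_filled = nash_sutcliffe(np.nan_to_num(obs), sim)
```

With this sample, the zero-filled score comes out lower than the score on the selected pairs, because the artificial zeros are compared against real simulated values.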