
I'm searching for a simple method to interpolate a matrix in which about 10% of the values are NaN. For instance:

import numpy as np

matrix = np.array([[np.nan, np.nan,  2.,  3.,  4.],
                   [np.nan,  6.,     7.,  8.,  9.],
                   [10.,    11.,    12., 13., 14.],
                   [15.,    16.,    17., 18., 19.],
                   [np.nan, np.nan, 22., 23., np.nan]])

I found a solution that uses griddata from scipy.interpolate, but it takes too much time. (My matrix has about 50 columns and 200,000 rows, and the rate of NaN values is no higher than 10%.)
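The question doesn't show the actual griddata code, so the following is only a sketch of one common way to fill NaNs in a 2-D array with scipy.interpolate.griddata (the function name `fill_nan_griddata` is made up for illustration). Note that with `method='linear'`, cells outside the convex hull of the valid points remain NaN:

```python
import numpy as np
from scipy.interpolate import griddata

def fill_nan_griddata(matrix):
    """Fill NaNs by 2-D linear interpolation over the valid cells."""
    rows, cols = np.indices(matrix.shape)
    valid = ~np.isnan(matrix)
    out = matrix.copy()
    # Interpolate only at the NaN positions; NaNs outside the convex
    # hull of the valid points stay NaN with method='linear'.
    out[~valid] = griddata(
        (rows[valid], cols[valid]),    # coordinates of known values
        matrix[valid],                 # known values
        (rows[~valid], cols[~valid]),  # coordinates to fill
        method='linear')
    return out
```

This approach triangulates all valid points (a Delaunay triangulation), so on roughly 9 million valid cells it is expensive regardless of how few NaNs there are, which would explain the slowness.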

Eduardo Cesar
  • Possible duplicate of [Interpolate NaN values in a numpy array](http://stackoverflow.com/questions/6518811/interpolate-nan-values-in-a-numpy-array) – tmdavison Nov 08 '16 at 14:08
  • It's not a duplicate, because my problem is linear interpolation in a matrix (not just a 1-D list), and performance is important. I have a solution, but it takes much more time than I think is necessary, because the NaN rate is not high. – Eduardo Cesar Nov 08 '16 at 14:18
  • Why should the NaN density matter? Show us the `scipy.interpolate` solution. Explain why you think there should be a faster solution. Maybe even sketch such a solution. – hpaulj Nov 08 '16 at 17:51
  • I have implemented a solution like this one: http://stackoverflow.com/a/37882746/6904671 – Eduardo Cesar Nov 09 '16 at 11:14
  • I think the density matters a lot for the complexity. If the NaN values do not occur in sequence, i.e., every NaN is surrounded by non-NaN values, interpolating each one is O(1), so the total average complexity is O(m × n). That is very fast on my data. If NaN values do occur in sequence, a single interpolation cannot be as fast, because it has to walk from neighbor to neighbor until it finds non-NaN endpoints. – Eduardo Cesar Nov 09 '16 at 11:28
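If row-wise linear interpolation is acceptable (an assumption; the question doesn't pin down the interpolation direction), the endpoint search described in the last comment can be delegated to np.interp, which handles runs of consecutive NaNs in one vectorized call per row and clamps leading/trailing NaNs to the nearest valid value. A minimal sketch (`interp_rows` is a hypothetical helper, not the asker's code):

```python
import numpy as np

def interp_rows(matrix):
    """Linearly interpolate NaNs along each row.

    Edge NaNs take the nearest valid value, because np.interp
    clamps queries outside the known range to the end values.
    """
    out = matrix.copy()
    for row in out:  # each `row` is a view, so writes modify `out`
        nans = np.isnan(row)
        if nans.any() and (~nans).any():  # skip all-NaN / NaN-free rows
            idx = np.arange(row.size)
            row[nans] = np.interp(idx[nans], idx[~nans], row[~nans])
    return out

matrix = np.array([[np.nan, np.nan,  2.,  3.,  4.],
                   [np.nan,  6.,     7.,  8.,  9.],
                   [10.,    11.,    12., 13., 14.],
                   [15.,    16.,    17., 18., 19.],
                   [np.nan, np.nan, 22., 23., np.nan]])
print(interp_rows(matrix))
```

Each row costs O(n) regardless of how the NaNs cluster, so the whole pass is O(m × n), matching the complexity argument in the comment above.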

0 Answers