fill/predict unknown values of matrix in python

Question

I have a process that gives me known values at some positions of a matrix, for example the number of people in each 1x1 meter cell. So for a room of 5x6 meters, we may have detected a number of people in only some of the cells.

Now we want to create a heat map or a surface representation, so there are three options:

Set the unknown cells as unknown
Set the unknown cells as zero
Try to predict and fill the unknown cells

What I want to know is how to do the third option in Python. I think this kind of operation is common in image processing (like scaling). The idea is to make the heat map smooth from a limited set of data.

Regards.

possible duplicate of [Interpolate NaN values in a numpy array](http://stackoverflow.com/questions/6518811/interpolate-nan-values-in-a-numpy-array) — tmdavison, Aug 26 '15 at 14:44
Do you want the exact values in your 'YES' matrix, or do you just want a method that interpolates values? For example, it's pretty unclear how you've chosen the edge values. The top row is clear, but the other edges are not. In particular, how did you get the `5` at position `[2,0]`, or the `6` at position `[5,2]`? If it's OK to interpolate linearly around the outside edge and then fill the interior, this problem is straightforward. If you need something different from that, then you probably need to provide more information. — farenorth, Aug 26 '15 at 15:55

farenorth · Accepted Answer · 2015-08-26T17:07:50.013

Assuming you don't need the exact edge values that are specified in the 'YES' matrix (see my comment), here is what I would do:

import numpy as np
from scipy.interpolate import griddata

nan = np.nan  # `np.NaN` was removed in NumPy 2.0; `np.nan` works in all versions
dat = np.array([[ 1,    nan,  nan,  nan,   5,],
                [ nan,  nan,  nan,  nan,  nan,],
                [ nan,  6,    nan,  nan,  8,  ],
                [ nan,  nan,  9,    nan,  nan,],
                [ nan,  nan,  nan,  nan,  nan,],
                [ 0,    nan,  nan,  nan,  2,  ]])

def fill_nans(indata, method='linear'):
    """
    Fill NaN values in the input array `indata`.
    """
    # Find the non-NaN indices
    inds = np.nonzero(~np.isnan(indata))
    # Create an `out_inds` array that contains all of the indices of indata.
    out_inds = np.mgrid[[slice(s) for s in indata.shape]].reshape(indata.ndim, -1).T
    # Perform the interpolation of the non-NaN values to all the indices in the array:
    return griddata(inds, indata[inds], out_inds, method=method).reshape(indata.shape)

out = fill_nans(dat)
print(out)

Which gives,

[[ 1.          2.          3.          4.          5.        ]
 [ 0.8         4.          5.          5.83333333  6.5       ]
 [ 0.6         6.          6.66666667  7.33333333  8.        ]
 [ 0.4         5.25        9.          7.5         6.        ]
 [ 0.2         4.5         5.          5.5         4.        ]
 [ 0.          0.5         1.          1.5         2.        ]]
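In case the `out_inds` line looks cryptic, here is a small sketch of what it builds, using a made-up 2x3 shape. The `np.indices` variant is my own equivalent, not part of the code above; `shape`, `grid_a` and `grid_b` are just illustrative names.

```python
import numpy as np

shape = (2, 3)  # hypothetical small shape, only for illustration

# Same construction as `out_inds` above: one (row, col) pair per cell,
# listed in row-major order.
grid_a = np.mgrid[tuple(slice(s) for s in shape)].reshape(len(shape), -1).T

# An equivalent spelling with np.indices (assumed alternative).
grid_b = np.indices(shape).reshape(len(shape), -1).T

print(grid_a)
```

Each row of the result is one target coordinate, which is the point layout that `griddata` expects for its third argument.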

You could also do,

out2 = fill_nans(dat, method='cubic')
print(out2)

Which gives,

[[ 1.          2.34765155  3.45834401  4.33986447  5.        ]
 [ 2.24368285  4.39570784  5.76911468  6.7162754   6.94217514]
 [ 2.88169911  6.          7.62769189  8.27187136  8.        ]
 [ 2.79787395  6.53998191  9.          8.99319441  7.42165234]
 [ 1.87603253  5.20787111  6.8176744   6.80953373  5.26441632]
 [ 0.          1.73565977  2.59374609  2.65495937  2.        ]]
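One caveat: with `method='linear'` or `method='cubic'`, `griddata` only fills points inside the convex hull of the known cells and returns NaN outside it. The example works because all four corners are known. If your detections don't cover the corners, one possible workaround is to fill whatever is left with nearest-neighbour values. A rough sketch, where `fill_nans_with_fallback` and the `sparse` test array are names and data I made up for illustration:

```python
import numpy as np
from scipy.interpolate import griddata

def fill_nans_with_fallback(indata, method="linear"):
    """Interpolate NaNs, then fill cells outside the convex hull
    of the known points with the nearest known value."""
    known = ~np.isnan(indata)
    points = np.argwhere(known)   # (n_known, ndim) coordinates, row-major
    values = indata[known]        # matching values, same row-major order
    targets = np.argwhere(np.ones(indata.shape, dtype=bool))  # every cell

    out = griddata(points, values, targets, method=method).reshape(indata.shape)

    # Cells outside the convex hull come back as NaN; patch them.
    holes = np.isnan(out)
    if holes.any():
        out[holes] = griddata(points, values, np.argwhere(holes), method="nearest")
    return out

# Made-up data: known values only in the interior, none at the corners.
sparse = np.full((4, 5), np.nan)
sparse[1, 1] = 2.0
sparse[1, 3] = 4.0
sparse[2, 2] = 6.0
sparse[3, 2] = 1.0

filled = fill_nans_with_fallback(sparse)
print(filled)
```

Nearest-neighbour filling gives flat patches near the edges, so whether it is acceptable depends on how smooth the heat map needs to be there.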

Obviously if you just wanted integer values, you could add a `.round().astype(int)` to the end of the `fill_nans(...)` call, in which case `out` is:

[[1 2 3 4 5]
 [1 4 5 6 6]
 [1 6 7 7 8]
 [0 5 9 8 6]
 [0 4 5 6 4]
 [0 0 1 2 2]]

And `out2` is:

[[1 2 3 4 5]
 [2 4 6 7 7]
 [3 6 8 8 8]
 [3 7 9 9 7]
 [2 5 7 7 5]
 [0 2 3 3 2]]
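If the values are head counts, as in the question, it may also make sense to clip negatives before rounding, since cubic interpolation can overshoot the range of the input data. A minimal sketch; `to_counts` is a hypothetical helper name, and clipping at zero is my assumption about what is sensible for counts:

```python
import numpy as np

def to_counts(filled):
    """Clip negatives to zero, round to the nearest integer, cast to int.

    Assumes `filled` has no NaNs left; casting NaN to int is not meaningful.
    """
    return np.clip(filled, 0, None).round().astype(int)

demo = np.array([[-0.3, 0.4], [1.6, 2.6]])  # made-up values for illustration
counts = to_counts(demo)
print(counts)
```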

Again, obviously these are not exactly the same as your 'YES' matrix, but hopefully it's helpful. Best of luck!
