
I need to interpolate a multi-index dataframe.

For example, this is the main dataframe:

a    b    c    result
1    1    1    6
1    1    2    9
1    2    1    8
1    2    2    11
2    1    1    7
2    1    2    10
2    2    1    9
2    2    2    12

I need to find the result for the point (a, b, c):

1.3    1.7    1.55    

What I've been doing so far is appending a pd.Series with NaN for each index level individually and interpolating one level at a time (see the algorithm at the end of the question).

This seems like a VERY inefficient way.

I would be happy if someone could enlighten me.

P.S. I spent some time looking over SO, and if the answer is in there, I missed it:

Fill multi-index Pandas DataFrame with interpolation

Resampling Within a Pandas MultiIndex

pandas multiindex dataframe, ND interpolation for missing values


Algorithm:

stage 1:

a    b    c    result
1    1    1    6
1    1    2    9
1    2    1    8
1    2    2    11
1.3    1    1    6.3
1.3    1    2    9.3
1.3    2    1    8.3
1.3    2    2    11.3
2    1    1    7
2    1    2    10
2    2    1    9
2    2    2    12

stage 2:

a    b    c    result
1    1    1    6
1    1    2    9
1    2    1    8
1    2    2    11
1.3    1    1    6.3
1.3    1    2    9.3
1.3    1.7    1    7.7
1.3    1.7    2    10.7
1.3    2    1    8.3
1.3    2    2    11.3
2    1    1    7
2    1    2    10
2    2    1    9
2    2    2    12

stage 3:

a    b    c    result
1    1    1    6
1    1    2    9
1    2    1    8
1    2    2    11
1.3    1    1    6.3
1.3    1    2    9.3
1.3    1.7    1    7.7
1.3    1.7    1.55    9.35
1.3    1.7    2    10.7
1.3    2    1    8.3
1.3    2    2    11.3
2    1    1    7
2    1    2    10
2    2    1    9
2    2    2    12
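
The three stages above amount to linear interpolation along one axis at a time (a, then b, then c). A minimal pure-Python sketch of that arithmetic, assuming the 2x2x2 grid from the example; the names `grid`, `lerp` and `staged_interpolate` are illustrative only:

```python
from itertools import product

# Known values from the table above, keyed by (a, b, c).
grid = {
    (1, 1, 1): 6, (1, 1, 2): 9, (1, 2, 1): 8, (1, 2, 2): 11,
    (2, 1, 1): 7, (2, 1, 2): 10, (2, 2, 1): 9, (2, 2, 2): 12,
}

def lerp(x, x0, x1, y0, y1):
    """Linear interpolation between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def staged_interpolate(grid, a, b, c, lo=1, hi=2):
    """Interpolate along a, then b, then c (assumes every axis spans lo..hi)."""
    # Stage 1: fix a, leaving one value per (b, c) corner.
    stage1 = {
        (bb, cc): lerp(a, lo, hi, grid[lo, bb, cc], grid[hi, bb, cc])
        for bb, cc in product((lo, hi), repeat=2)
    }
    # Stage 2: fix b, leaving one value per c corner.
    stage2 = {cc: lerp(b, lo, hi, stage1[lo, cc], stage1[hi, cc]) for cc in (lo, hi)}
    # Stage 3: fix c, leaving the final interpolated value.
    return lerp(c, lo, hi, stage2[lo], stage2[hi])

value = staged_interpolate(grid, 1.3, 1.7, 1.55)
print(round(value, 2))
```

Up to floating-point rounding, this reproduces the intermediate values 6.3, 9.3, 8.3, 11.3 (stage 1), 7.7 and 10.7 (stage 2), and the final 9.35 (stage 3).
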
TrebledJ
umn
  • What does each stage mean? And what do you mean by needing to find the result for '1.3 1.7 1.55'? – Jessica Dec 20 '18 at 15:58
  • The stages I wrote down are my current method for solving the problem. The 4th column is the actual value for the first three columns. Imagine it as a 4D function: f(x, y, z) = w – umn Dec 20 '18 at 16:41

1 Answer


You can use scipy.interpolate.LinearNDInterpolator to do what you want. If the dataframe has a MultiIndex with the levels 'a', 'b' and 'c', then:

from scipy.interpolate import LinearNDInterpolator as lNDI
print (lNDI(points=df.index.to_frame().values, values=df.result.values)([1.3, 1.7, 1.55]))
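
For a run that does not depend on an existing `df`, here is a self-contained sketch. It assumes the question's table is rebuilt by hand with (a, b, c) as the MultiIndex; the variable names `interp`, `inside` and `outside` are illustrative. It also shows that, with the default `fill_value`, a query outside the convex hull of the known points comes back as NaN rather than being extrapolated:

```python
import pandas as pd
from scipy.interpolate import LinearNDInterpolator

# Rebuild the question's table with (a, b, c) as a MultiIndex.
df = pd.DataFrame({
    "a": [1, 1, 1, 1, 2, 2, 2, 2],
    "b": [1, 1, 2, 2, 1, 1, 2, 2],
    "c": [1, 2, 1, 2, 1, 2, 1, 2],
    "result": [6, 9, 8, 11, 7, 10, 9, 12],
}).set_index(["a", "b", "c"])

# One row of coordinates per known point, one known value per row.
interp = LinearNDInterpolator(
    points=df.index.to_frame().to_numpy(dtype=float),
    values=df["result"].to_numpy(dtype=float),
)

inside = interp([1.3, 1.7, 1.55])  # query inside the cube of known points
outside = interp([3.0, 1.0, 1.0])  # query outside the convex hull: NaN by default
print(inside, outside)
```
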

Now, if you have a dataframe whose index holds all the tuples (a, b, c) you want to calculate, you can do, for example:

import pandas as pd
from scipy.interpolate import LinearNDInterpolator as lNDI

def pd_interpolate_MI(df_input, df_toInterpolate):
    # build the interpolation function from the known (a, b, c) -> result points
    func_interp = lNDI(points=df_input.index.to_frame().values, values=df_input.result.values)
    # work on a copy so the caller's dataframe is not modified
    df_toInterpolate = df_toInterpolate.copy()
    # calculate the values for the unknown index tuples
    df_toInterpolate['result'] = func_interp(df_toInterpolate.index.to_frame().values)
    # return the known and interpolated rows together, sorted by index
    return pd.concat([df_input, df_toInterpolate]).sort_index()

Then, for example, with your df and df_toI = pd.DataFrame(index=pd.MultiIndex.from_tuples([(1.3, 1.7, 1.55),(1.7, 1.4, 1.9)],names=df.index.names)), you get:

print (pd_interpolate_MI(df, df_toI))
              result
a   b   c           
1.0 1.0 1.00    6.00
        2.00    9.00
    2.0 1.00    8.00
        2.00   11.00
1.3 1.7 1.55    9.35
1.7 1.4 1.90   10.20
2.0 1.0 1.00    7.00
        2.00   10.00
    2.0 1.00    9.00
        2.00   12.00
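
As a side note, the example data happens to lie on a complete rectangular grid. In that case scipy.interpolate.RegularGridInterpolator is a possible alternative to LinearNDInterpolator. It is a different technique (linear interpolation per axis, with no triangulation), and the sketch below assumes every (a, b, c) combination is present in the index. The names `axes`, `values`, `rgi` and `queries` are illustrative:

```python
import numpy as np
import pandas as pd
from scipy.interpolate import RegularGridInterpolator

# Rebuild the question's table; sort so rows follow (a, b, c) order.
df = pd.DataFrame({
    "a": [1, 1, 1, 1, 2, 2, 2, 2],
    "b": [1, 1, 2, 2, 1, 1, 2, 2],
    "c": [1, 2, 1, 2, 1, 2, 1, 2],
    "result": [6, 9, 8, 11, 7, 10, 9, 12],
}).set_index(["a", "b", "c"]).sort_index()

# One sorted coordinate array per index level (assumes a full grid).
axes = tuple(level.to_numpy(dtype=float) for level in df.index.levels)

# Reshape the flat result column into an array with one axis per level.
values = df["result"].to_numpy(dtype=float).reshape([len(ax) for ax in axes])

rgi = RegularGridInterpolator(axes, values)

queries = np.array([[1.3, 1.7, 1.55], [1.7, 1.4, 1.9]])
print(rgi(queries))
```

By default RegularGridInterpolator raises an error for queries outside the grid bounds (`bounds_error=True`), whereas LinearNDInterpolator returns NaN there.
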
umn
Ben.T