I have the following dataframe df1, which represents a grid of coordinates:

     latitude  longitude  level            time
0   40.008606  20.114280  880.0  3/31/1981 5:00
1   40.008606  20.114280  880.0  3/31/1981 6:00
2   40.008606  20.114280  880.0  3/31/1981 7:00
3   40.008606  20.114280  880.0  3/31/1981 8:00
4   39.665283  20.097115  855.0  3/31/1981 5:00
5   39.665283  20.097115  855.0  3/31/1981 6:00
6   39.665283  20.097115  855.0  3/31/1981 7:00
7   39.665283  20.097115  855.0  3/31/1981 8:00
8   39.665283  19.911120  860.0  3/31/1981 5:00
9   39.665283  19.911120  860.0  3/31/1981 6:00
10  39.665283  19.911120  860.0  3/31/1981 7:00
11  39.665283  19.911120  860.0  3/31/1981 8:00

I want to interpolate 4D weather data onto the grid above, where latitude, longitude, level and time are the dimensions. The values, at a resolution of 0.25 deg in latitude and longitude and 25 mbar in level, are in dataframe df2 below:

    latitude  level  longitude            time          t
0      40.00  875.0      20.00  3/31/1981 5:00   7.622246
1      40.00  875.0      20.00  3/31/1981 6:00   8.832257
2      40.00  875.0      20.00  3/31/1981 7:00   1.107310
3      40.00  875.0      20.00  3/31/1981 8:00  11.144372
4      40.00  900.0      20.00  3/31/1981 5:00   8.736878
..       ...    ...        ...             ...        ...
66     40.25  900.0      20.25  3/31/1981 8:00   6.014550
67     40.25  850.0      20.25  3/31/1981 5:00   6.729872
68     40.25  850.0      20.25  3/31/1981 6:00   8.098390
69     40.25  850.0      20.25  3/31/1981 7:00   5.234497
70     40.25  850.0      20.25  3/31/1981 8:00   5.968091

The entire dataframe is at this link. What I need is column t of df2 interpolated onto the points of df1, as a new column in df1. I hope the desired output is clear.

So far I am considering the solution from this post, but it uses the same data type for all dimensions, which is not the case here. I managed to find the nearest latitude, longitude and level in df2, add those as columns to df1, and then use:

rslt = pd.merge(df1, df2, on=["latitude", "level", "longitude"], how="left")

but this only picks the nearest grid member; it does not give a smooth value interpolated over the dimensions above.

Any help in resolving this is appreciated.

user2727167

1 Answer

I suggest you convert your data to an xarray Dataset and then use its multidimensional interpolation capability:

http://xarray.pydata.org/en/stable/user-guide/interpolation.html#multi-dimensional-interpolation
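
A minimal sketch of how that could look, assuming df2 forms a complete regular grid in latitude, level, longitude and time. The grid values below are synthetic stand-ins (a made-up linear function, not the real weather data), and the names grid_axes, ds and points are only illustrative. Passing DataArrays that share one new dimension to interp gives one interpolated value per row of df1 instead of a full outer product:

```python
import numpy as np
import pandas as pd
import xarray as xr  # interp also needs scipy installed

# Synthetic stand-in for df2: a complete regular grid (assumption).
times = pd.to_datetime(["3/31/1981 5:00", "3/31/1981 6:00",
                        "3/31/1981 7:00", "3/31/1981 8:00"])
grid_axes = {
    "latitude": [39.5, 39.75, 40.0, 40.25],
    "level": [850.0, 875.0, 900.0],
    "longitude": [19.75, 20.0, 20.25],
    "time": times,
}
df2 = pd.MultiIndex.from_product(
    list(grid_axes.values()), names=list(grid_axes.keys())
).to_frame(index=False)

# Made-up linear field so the interpolated result is easy to check.
hours = (df2["time"] - times[0]) / pd.Timedelta(hours=1)
df2["t"] = 2 * df2["latitude"] + 3 * df2["longitude"] + 0.01 * df2["level"] + hours

# Target points, taken from a few rows of df1 in the question.
df1 = pd.DataFrame({
    "latitude": [40.008606, 39.665283, 39.665283],
    "longitude": [20.114280, 20.097115, 19.911120],
    "level": [880.0, 855.0, 860.0],
    "time": pd.to_datetime(["3/31/1981 5:00", "3/31/1981 6:00", "3/31/1981 7:00"]),
})

# Long table -> 4D Dataset with one dimension per index level.
ds = df2.set_index(["latitude", "level", "longitude", "time"]).to_xarray()

# All indexers share the new "points" dimension, so the interpolation
# is pointwise: one value per row of df1.
points = {
    name: xr.DataArray(df1[name].to_numpy(), dims="points")
    for name in ["latitude", "longitude", "level", "time"]
}
df1["t"] = ds["t"].interp(**points, method="linear").values
```

Points that fall outside the range covered by df2 come back as NaN with the default settings, so it is worth checking that the grid in df2 actually encloses every row of df1.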

adr