10

I have weather data available for about 6 weather stations. For all these stations I have the longitude and latitude available, and also the datetime (every 10 minutes from beginning of 2016 or so). I want to use the kriging interpolation method to fill in missing values at other long/lat locations (between these stations).

I know that scikit-learn has the 'GaussianProcessRegressor', which can be used for kriging. However, I do not understand how I can include the temporal dimension in the fitting process. Is this even possible, or should I fit a separate model for every datetime I have?

X must be an array of features, which in my case would be the latitude and longitude (I think). X is now a list of 6 lat/long pairs (e.g. [52.1093, 5.181]) for every station. I took one date to test the GPR. y is a list of length 6 that contains the dew points for those stations at that specific time.

The problem is that I actually want to do kriging for all the datetimes. How do I incorporate them? Should I add the datetime components as features in the X array (e.g. [52.1093, 5.181, 2017, 1, 2, 10, 50])? This looks really weird to me. However, I can't find any other way to model the temporal factor as well.

My code for fitting the GaussianProcessRegressor:

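    from sklearn.gaussian_process import GaussianProcessRegressor

    # Keep the rows for one timestamp (`datetime` holds the timestamp under test),
    # with at most one row per station location.
    one_date = meteo_df[meteo_df['datetime'] == datetime].drop_duplicates(
        subset=['long', 'lat'], keep='last')

    long = one_date['long']
    lat = one_date['lat']
    x = [[la, lo] for la, lo in zip(lat, long)]  # features: [lat, long] per station
    y = list(one_date['dew_point'])              # target: dew point per station

    GPR = GaussianProcessRegressor(n_restarts_optimizer=10)
    GPR.fit(x, y)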
LucioRandy
Josh

2 Answers

1

I am assuming that you want out of the box solutions. You have a few options, albeit some feel a bit hacky to me.

  1. Model time as a third dimension, as done in Graeler et al. 2013, using PyKrige's 3D kriging. Be careful to re-scale your time variable so that its range is comparable to that of your X, Y coordinates.
  2. Build your kriging system using the space-time variograms of SciKit-GStat.
  3. Solve your Kriging system independently for each time period, which is probably your worst option because it ignores the time dependency of your points.

Graeler et al. 2013 describe, compare and expand on some of these options in their paper.
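As a rough illustration of option 1 (time as a third coordinate, re-scaled to match the spatial ones), here is a minimal sketch. Note that it swaps in scikit-learn's GaussianProcessRegressor from the question rather than PyKrige, and that the station positions, dew points, `time_scale` heuristic and kernel settings are all made up for the example, not taken from the question's data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical stand-in data: 6 stations, 12 readings each at 10-minute steps.
rng = np.random.default_rng(0)
stations = rng.uniform([51.5, 4.5], [52.5, 6.0], size=(6, 2))  # columns: lat, long
minutes = np.arange(12) * 10.0                                 # minutes since start

# One row per (station, time) pair: [lat, long, minutes].
X = np.array([[la, lo, t] for la, lo in stations for t in minutes])
# Made-up dew points that vary smoothly in space and time, plus small noise.
y = 5.0 + 2.0 * (X[:, 0] - 52.0) + np.sin(X[:, 2] / 60.0) + rng.normal(0.0, 0.05, len(X))

# Assumed heuristic: rescale time so its spread matches the average spread
# of the two spatial columns.
time_scale = X[:, :2].std(axis=0).mean() / X[:, 2].std()

def scale_time(points):
    """Return a copy of `points` with the time column multiplied by time_scale."""
    points = np.asarray(points, dtype=float).copy()
    points[:, 2] *= time_scale
    return points

# Anisotropic RBF: one length scale per dimension (lat, long, time);
# WhiteKernel absorbs measurement noise.
kernel = RBF(length_scale=[1.0, 1.0, 1.0]) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, n_restarts_optimizer=2)
gpr.fit(scale_time(X), y)

# Predict at an unobserved location and time (35 minutes after the start).
mean, std = gpr.predict(scale_time([[52.0, 5.2, 35.0]]), return_std=True)
```

Encoding time as a single numeric column (minutes since a reference instant) avoids the awkward year/month/day/hour/minute feature split, and the per-dimension length scales let the model learn a different correlation range in time than in space.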

VinceP
-1

See the papers by D.E. Myers, S. De Iaco and D. Posa.

You need Euclidean coordinates, e.g. UTM, for the locations instead of lat/long.
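To show what moving from lat/long to planar coordinates looks like, here is a small sketch. It is not a UTM conversion: it uses a simple equirectangular approximation around a reference point, which is only reasonable for stations spread over a small region. For real UTM coordinates you would use a proper projection library. The function name `to_local_xy_km` and every station except the first (which comes from the question) are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def to_local_xy_km(lat, lon, lat0, lon0):
    """Approximate planar offsets (east, north) in km from (lat0, lon0).

    Equirectangular approximation; an assumption that only holds for small areas.
    """
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_KM
    y = math.radians(lat - lat0) * EARTH_RADIUS_KM
    return x, y

# First pair is from the question; the other two are made up for illustration.
stations = [(52.1093, 5.181), (52.30, 4.77), (51.45, 5.41)]

# Use the centroid of the stations as the reference point.
lat0 = sum(lat for lat, _ in stations) / len(stations)
lon0 = sum(lon for _, lon in stations) / len(stations)

# Planar coordinates in km, usable as kriging inputs instead of raw degrees.
xy_km = [to_local_xy_km(lat, lon, lat0, lon0) for lat, lon in stations]
```

With coordinates in kilometres, a distance of 1 means the same thing east-west as north-south, which is not true of raw degrees at this latitude.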