I have a dataset containing cities and their GPS coordinates (latitude, longitude):

Amsterdam   52.370216   4.895168
Enschede    52.221537   6.893662

And a set of weather stations:

Schiphol    52.307687   52.307687
Almelo      52.367027   6.668492

What I would like to do now is link each city to its nearest weather station. So Amsterdam should be linked to Schiphol, and Enschede to Almelo.

I assume I have to apply some kind of KNN-like algorithm here. Can anyone recommend a package that makes it easy to match stations and cities?

Frank Gerritsen

2 Answers

This might help, or at least get you started:

library(weatherData)
getStationCode("Amsterdam")
[1] "   NEW AMSTERDAM                81058  06 15N  057 18W    2   X                7 GY" "   AMSTERDAM/SCHIPH EHAM        06240  52 19N  004 47E    9   X     T          6 NL"

Weather <- getSummarizedWeather("CYQY", "2015-07-26", "2015-07-28", opt_custom_columns=F)
Weather
        Date Max_TemperatureC Mean_TemperatureC Min_TemperatureC
1 2015-07-26               17                14               11
2 2015-07-27               16                14               12
3 2015-07-28               20                17               14

For more information, see the package manual: https://cran.r-project.org/web/packages/weatherData/index.html

MLavoie

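There is no need for clustering here. Just calculate the distance between a weather station and each city and select the closest. Treating longitude and latitude as planar coordinates (an approximation that ignores the Earth's curvature, but is usually adequate for picking the nearest point over short distances), the distance can be calculated as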

sqrt((cityLong - stationLong)^2 + (cityLat - stationLat)^2)
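One caveat with the formula above is that a degree of longitude covers less ground than a degree of latitude away from the equator. A minimal sketch of a common correction follows, scaling the longitude difference by the cosine of the mean latitude. The helper name `planar_dist` is hypothetical and not part of any package; the sample call uses the Enschede and Almelo coordinates from the question.

```r
# Planar distance in degrees, with the longitude difference scaled by
# cos(mean latitude) so east-west and north-south degrees are comparable.
# planar_dist is a hypothetical helper name, not a library function.
planar_dist <- function(lat1, long1, lat2, long2) {
  mean_lat <- (lat1 + lat2) / 2 * pi / 180   # mean latitude in radians
  dx <- (long1 - long2) * cos(mean_lat)      # scaled east-west difference
  dy <- lat1 - lat2                          # north-south difference
  sqrt(dx^2 + dy^2)
}

# Enschede to Almelo
d_example <- planar_dist(52.221537, 6.893662, 52.367027, 6.668492)
d_example
```

The result is still in degrees, so it is only useful for ranking candidates, not as a distance in kilometres.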

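Assuming your data is in two data frames, cities and stations, each with name, lat and long columns, this returns the closest city for each station (swap the two data frames to get the closest station for each city):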

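# Loop over row indices instead of apply() over rows: apply() coerces a
# mixed-type data frame to a character matrix, which breaks the arithmetic.
sapply(seq_len(nrow(stations)), function(i){
        # squared distance from station i to every city (sqrt is not needed to find the minimum)
        distance <- (cities$long-stations$long[i])^2+(cities$lat-stations$lat[i])^2
        as.character(cities$name[which.min(distance)])
})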
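For the direction the question asks about (nearest station for each city), here is a self-contained sketch in base R that uses the great-circle (haversine) distance in place of the planar one. The data frames are hypothetical sample data: Amsterdam and Enschede are given their actual locations, and Schiphol's longitude (4.77) is an assumed approximate value, since the one listed in the question duplicates the latitude. `haversine_km` is a hypothetical helper name.

```r
# Hypothetical sample data; Schiphol's longitude is an assumed approximation.
cities <- data.frame(name = c("Amsterdam", "Enschede"),
                     lat  = c(52.370216, 52.221537),
                     long = c(4.895168, 6.893662),
                     stringsAsFactors = FALSE)
stations <- data.frame(name = c("Schiphol", "Almelo"),
                       lat  = c(52.307687, 52.367027),
                       long = c(4.77, 6.668492),
                       stringsAsFactors = FALSE)

# Great-circle distance in km (haversine formula, mean Earth radius 6371 km).
haversine_km <- function(lat1, long1, lat2, long2) {
  to_rad <- pi / 180
  dlat  <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
       cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

# Distance matrix: one row per city, one column per station.
d <- outer(seq_len(nrow(cities)), seq_len(nrow(stations)),
           function(i, j) haversine_km(cities$lat[i], cities$long[i],
                                       stations$lat[j], stations$long[j]))

# Column index of the nearest station for each city.
nearest <- apply(d, 1, which.min)

matched <- data.frame(city    = cities$name,
                      station = stations$name[nearest],
                      km      = d[cbind(seq_len(nrow(cities)), nearest)],
                      stringsAsFactors = FALSE)
matched
```

Building the full distance matrix is fine for a few thousand points; for much larger inputs a spatial index would probably be a better fit. If you prefer a package over a hand-rolled formula, geosphere provides distHaversine().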
nist