I have location data in one data frame (y) and weather data in another (weather).
I want to merge the weather data into y, but only for the dates and times that have a corresponding row in y.
I've tried merge and rbind, and I get either an empty data frame or one with millions of rows when there should be ~7000.
names(y)
[1] "ID" "Year" "Month" "Day" "Time" "Source"
[7] "Source.Lat" "Source.Lon" "Target" "Target.Lat" "Target.Lon"
names(weather)
[1] "Target" "Year" "Month" "Day" "Time"
[6] "Temp" "Dew_Point_Temp" "Humidity" "Wind_Direction" "Wind_Speed"
[11] "Pressure" "Humidex"
all.data <- merge(y, weather, by = c("Target","Year","Month","Day","Time"))
I would like to add the weather columns to y only where Target, Year, Month, Day, and Time all match, and disregard the remaining weather rows.
Sample data (y):
ID Year Month Day Time Target Lat Lon
1 35624 2019 06 19 11:00 Kejimkujik 46.3236 -114.1319
3 35651 2019 06 19 14:00 CNSC 2019 58.7378 -93.8194
5 35620 2019 06 19 14:00 CNSC 2019 58.7378 -93.8194
7 35624 2019 06 20 04:00 CNSC 2019 58.7378 -93.8194
9 35651 2019 06 20 05:00 CNSC 2019 58.7378 -93.8194
Sample data (weather):
Target Year Month Day Time Temp DP Hum WD WS Pressure
1 Kejimkujik 2019 6 1 0:00 6.5 6.1 97 32 3 99.51
2 Kejimkujik 2019 6 1 1:00 5.9 5.6 98 30 2 99.50
3 Kejimkujik 2019 6 1 2:00 4.9 4.7 98 31 3 99.52
4 Kejimkujik 2019 6 1 3:00 4.4 4.3 99 32 3 99.52
5 Kejimkujik 2019 6 1 4:00 4.1 4.0 99 24 3 99.57
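
For reference, here is a minimal sketch of the match described above. It assumes, as the samples suggest, that the key columns are stored differently in the two data frames (Month as "06" in y but 6 in weather, Time as "04:00" in one and "4:00" in the other), which could explain the empty result, and that repeated key combinations in weather could explain the millions of rows. The toy values and the helper normalize_keys are hypothetical, not taken from the real data.

```r
# Toy stand-ins for the two data frames; values are illustrative only.
y <- data.frame(
  ID     = c(35624, 35651),
  Year   = c(2019, 2019),
  Month  = c("06", "06"),
  Day    = c("19", "19"),
  Time   = c("11:00", "04:00"),
  Target = c("Kejimkujik", "CNSC 2019"),
  stringsAsFactors = FALSE
)
weather <- data.frame(
  Target = c("Kejimkujik", "Kejimkujik"),
  Year   = c(2019, 2019),
  Month  = c(6, 6),
  Day    = c(19, 19),
  Time   = c("11:00", "4:00"),
  Temp   = c(18.2, 9.7),
  stringsAsFactors = FALSE
)

keys <- c("Target", "Year", "Month", "Day", "Time")

# Hypothetical helper: coerce the key columns to one common representation
# so that "06" matches 6 and "04:00" matches "4:00".
normalize_keys <- function(df) {
  df$Target <- trimws(as.character(df$Target))
  df$Year   <- as.integer(as.character(df$Year))
  df$Month  <- as.integer(as.character(df$Month))
  df$Day    <- as.integer(as.character(df$Day))
  # Drop a leading zero from a two-digit hour.
  df$Time   <- sub("^0([0-9]:)", "\\1", trimws(as.character(df$Time)))
  df
}

y_n       <- normalize_keys(y)
weather_n <- normalize_keys(weather)

# Repeated key combinations in weather would multiply rows in the result,
# so keep only the first row for each combination.
weather_n <- weather_n[!duplicated(weather_n[keys]), ]

# all.x = TRUE keeps every row of y and fills unmatched weather columns with NA.
all.data <- merge(y_n, weather_n, by = keys, all.x = TRUE)

# The result should have exactly one row per row of y.
stopifnot(nrow(all.data) == nrow(y))
```

Dropping all.x = TRUE would instead keep only the rows of y that have a weather match. Note that merge sorts the result by the key columns, so the row order of y is not preserved.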