I have two datasets, "January" and "Samples".

January:

structure(list(Month = c("January-", "January-", "January-", 
"January-", "January-"), long = c(-179.916672, -179.75, -179.583328, 
-179.416672, -179.25), lat = c(39.916668, 39.916668, 39.916668, 
39.916668, 39.916668), npp = c(297.813, 304.971, 292.946, 296.196, 
285.804)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", 
"data.frame"))

Samples:

structure(list(Lat = c(-14.5472718653846, -14.3532975333333, 
-14.2926716206897, -14.2153998571429, -14.0921711666667), Long = 
c(-168.131368846154, 
-170.325961333333, -169.499131724138, -169.060881071429, -169.664168333333
), Sample = c(1, 2, 3, 4, 5)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

"January" has lat/long and npp (surface productivity) values. The full dataset is very large, and its latitude spans from -19 to 30 decimal degrees (even though the example shows only a single latitude).

"Samples" has the Lat/Long of the 120 samples whose coordinates I am trying to match against "January" to get npp values.

I want to use match_nrst_haversine from hutilscpp. This is very similar to the question "Match two datasets by minimum geospatial distance (R)". However, I only have the Lat and Long columns in the "Samples" df to match to "January".

This is the code I've tried, but I'm not sure what to use for Index:

January[, c("long", "lat", "npp") := match_nrst_haversine(Lat,
                                                          Long,
                                                          npp_lat = npp$Lat,
                                                          npp_lon = npp$Long,
                                                          Index = samp$Lat:Long,
                                                          close_enough = 0,
                                                          cartesian_R = 5)]
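
For reference, here is a minimal base-R sketch of the matching described above: for each row of "Samples", find the row of "January" at the smallest great-circle distance and copy its npp value. It deliberately does not call match_nrst_haversine; it swaps in a plain haversine search so that every step is visible. The helper name haversine_km, the 6371 km Earth radius, and the new columns npp and dist_km are assumptions made for illustration, and the data frames are small stand-ins built from the dput output in this post.

```r
# Haversine (great-circle) distance in kilometres.
# Works for one point against a vector, or two vectors of equal length.
# haversine_km and radius_km = 6371 are illustrative assumptions.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Small stand-ins for the two datasets (values copied from the post).
January <- data.frame(
  long = c(-179.916672, -179.75, -179.583328, -179.416672, -179.25),
  lat  = c(39.916668, 39.916668, 39.916668, 39.916668, 39.916668),
  npp  = c(297.813, 304.971, 292.946, 296.196, 285.804)
)
Samples <- data.frame(
  Lat    = c(-14.5472718653846, -14.3532975333333, -14.2926716206897,
             -14.2153998571429, -14.0921711666667),
  Long   = c(-168.131368846154, -170.325961333333, -169.499131724138,
             -169.060881071429, -169.664168333333),
  Sample = c(1, 2, 3, 4, 5)
)

# For each sample, the row number of the nearest January grid point.
nearest_row <- vapply(seq_len(nrow(Samples)), function(i) {
  which.min(haversine_km(Samples$Lat[i], Samples$Long[i],
                         January$lat, January$long))
}, integer(1))

# Attach the matched npp value and the distance to the matched point.
Samples$npp <- January$npp[nearest_row]
Samples$dist_km <- haversine_km(Samples$Lat, Samples$Long,
                                January$lat[nearest_row],
                                January$long[nearest_row])
```

With only 120 samples, looping over the samples and scanning every "January" row once per sample keeps memory use at one distance vector at a time, which should stay manageable even when "January" is large. Checking dist_km afterwards is a simple way to spot samples whose nearest grid point is implausibly far away.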
Gina