0

I have 2 data sets with the exact same coordinates that I am trying to merge into 1 by rearranging the coordinates to match while shifting the data with it. One set has coordinates associated with bioclim variables and another has the same coordinates associated with land ownership, but these coordinates are out of order (thank you GIS for the mixaro). Any tips on how to rearrange it so the coordinates line up for easy dataset manipulation? I was trying to do this in excel but if anyone knows an R trick that would be super! ((Data pic example in the link))

Dataset example

r2evans
  • 141,215
  • 6
  • 77
  • 149
fran25
  • 23
  • 3

1 Answers1

0

You should use R data.table and merge the two tables by latitude and longitude.

library(data.table)
dt1=data.table(df1)
dt2=data.table(df2)
# converting the lat-long columns to characters
dt1[,`:=`(LATITUDE=as.character(LATITUDE), LONGITUDE=as.character(LONGITUDE))]
dt1[dt2,on=c("LATITUDE","LONGITUDE")]
  • performs a left join

OR

dt=merge(dt1,dt2,by=c("LATITUDE","LONGITUDE"),all=T)
  • by takes the column to merge upon
  • all=T performs an outer join
  • Check the documentation of datatable merge for use of other parameters.https://www.rdocumentation.org/packages/data.table/versions/1.13.0/topics/merge
Hasan Bhagat
  • 395
  • 1
  • 8
  • Unfortunately, this is unlikely to work reliably. True "equality" of floating point numbers works some of the time but is pretty much *guaranteed* to not work all of the time, see https://stackoverflow.com/q/9508518/3358272 for good examples of why. (The way to merge/join floating-point is over ranges, using a "non-equi join". This is supported by SQL (databases) and `data.table` natively, and `dbplyr` in the tidyverse when working on remote data using `sql_on`, but not `dplyr`-native.) – r2evans Sep 09 '20 at 05:28
  • 1
    Ahh, I see. However, if the columns are first type casted to character, floating point issue will not arise, right?. – Hasan Bhagat Sep 09 '20 at 05:32
  • 1
    However, since there is an expectation that two very-close-numbers are the same, it might work to `format(lat,digits=22)` (same for other three columns) and do a straight merge on string-equality (which is deterministic/predictable). – r2evans Sep 09 '20 at 05:33