I have a large dataset containing factory (mill) names, latitude, longitude, and other columns. Some factories have identical latitude and longitude even though their names differ slightly. How can I merge the rows of factories that share the same latitude and longitude in R?

mill  latitude  longitude  ID
a     12.34     7.86       NA
A     12.34     7.86       4
b     47.56     27.07      5

The output I am looking for is:

mill  latitude  longitude  ID
a     12.34     7.86       4
b     47.56     27.07      5
F H

1 Answer

We can use distinct from dplyr, which keeps the first row for each unique combination of latitude and longitude:

library(dplyr)
distinct(df1, latitude, longitude, .keep_all = TRUE)
#   mill latitude longitude
# 1    a    12.34      7.86
# 2    b    47.56     27.07

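With the updated question (which adds an ID column), one option is to arrange first so that rows with a non-missing ID sort ahead of rows where ID is NA within each latitude/longitude pair, because distinct keeps the first row of each unique combination: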

df1 %>%
   arrange(latitude, longitude, is.na(ID)) %>%
   distinct(latitude, longitude, .keep_all = TRUE)
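
As a rough, self-contained sketch of the above, the snippet below rebuilds the question's table as a data frame (the values in `df1` are transcribed from the question; the names `result` and `merged` are only illustrative) and runs the arrange/distinct approach. Note that this keeps the whole first row, so the surviving mill name is whichever row carried the non-missing ID. If the name from the first row should be kept instead, as in the desired output, a grouped summary is one possible alternative; that second part is an assumption about what is wanted, not something stated in the question.

```r
library(dplyr)

# Sample data transcribed from the question
df1 <- data.frame(
  mill      = c("a", "A", "b"),
  latitude  = c(12.34, 12.34, 47.56),
  longitude = c(7.86, 7.86, 27.07),
  ID        = c(NA, 4, 5)
)

# is.na(ID) is FALSE for non-missing IDs, and FALSE sorts before TRUE,
# so within each coordinate pair the row with an ID comes first
result <- df1 %>%
  arrange(latitude, longitude, is.na(ID)) %>%
  distinct(latitude, longitude, .keep_all = TRUE)

# Alternative sketch (an assumption about the desired result):
# keep the first mill name and the first non-missing ID per coordinate pair
merged <- df1 %>%
  group_by(latitude, longitude) %>%
  summarise(
    mill = first(mill),
    ID   = ID[!is.na(ID)][1],  # NA if the group has no non-missing ID
    .groups = "drop"
  )
```

Both approaches match coordinates by exact equality, so values that differ only by rounding would not be merged.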
akrun