1

I have a data frame with a Latitude column and a Longitude column that looks like this:

test <- data.frame("Latitude" = c(45.14565, 45.14565, 45.14565, 45.14565, 33.2222, 
31.22122, 31.22122), "Longitude" = c(-105.6666, -105.6666, -105.6666, -104.3333, 
-104.3333, -105.77777, -105.77777))

I would like every value to go out to 5 decimal places. In addition, if a latitude and longitude pair is the same as the pair in the row above it, I would like to add 0.00001 to both the latitude and the longitude value. So my data would change to this:

test_updated <- data.frame("Latitude" = c(45.14565, 45.14566, 45.14567, 45.14565, 
33.22220, 31.22122, 31.22123), "Longitude" = c(-105.66660, -105.66661, -105.66662, 
-104.33330, -104.33330, -105.77777, -105.77778))
Uwe
Sarah
    Related: [Increment by one to each duplicate value in R](https://stackoverflow.com/questions/43196718/increment-by-one-to-each-duplicate-value-in-r) – Henrik Aug 03 '21 at 16:03

3 Answers

1

Here is an approach which updates the Latitude and Longitude columns in test by reference to reproduce OP's expected result:

options(digits = 8) # required to print all significant digits of Longitude
library(data.table)
setDT(test)[, `:=`(Latitude  = Latitude  + (seq(.N) - 1) * 0.00001,
                   Longitude = Longitude + (seq(.N) - 1) * 0.00001), 
            by = .(Latitude, Longitude)]
test
   Latitude  Longitude
1: 45.14565 -105.66660
2: 45.14566 -105.66659
3: 45.14567 -105.66658
4: 45.14565 -104.33330
5: 33.22220 -104.33330
6: 31.22122 -105.77777
7: 31.22123 -105.77776

For comparison

test_updated
  Latitude  Longitude
1 45.14565 -105.66660
2 45.14566 -105.66661
3 45.14567 -105.66662
4 45.14565 -104.33330
5 33.22220 -104.33330
6 31.22122 -105.77777
7 31.22123 -105.77778

The discrepancy arises because OP asks to add 0.00001 to both the latitude and the longitude value, whereas in OP's expected result 0.00001 has been subtracted from the negative longitude values.
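The difference can be checked with plain arithmetic on the first duplicated longitude. This is only a minimal sketch: the names lon, step, added and subtracted are made up for illustration, and sprintf() is used here just to display five decimal places.

```r
lon  <- -105.6666
step <- 0.00001

# Literal addition moves a negative value towards zero
added <- sprintf("%.5f", lon + step)        # "-105.66659"

# OP's expected result moves it away from zero instead
subtracted <- sprintf("%.5f", lon - step)   # "-105.66661"

print(c(added = added, subtracted = subtracted))
```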

Edit

In order to reproduce the expected result, the sign of the value has to be considered. Unfortunately the base R sign() function returns zero for sign(0). So, we use fifelse(x < 0, -1, 1) instead.
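To illustrate why sign() is avoided: with sign(), a coordinate of exactly 0 would have its offset multiplied by 0 and would never be shifted. The sketch below uses base R's ifelse() so that it runs without any packages; fifelse() is data.table's counterpart. The vector x and the result names are hypothetical.

```r
x <- c(-105.6666, 0, 45.14565)

# sign() maps 0 to 0, so an offset multiplied by it would vanish
s_base <- sign(x)                  # -1  0  1

# Treat every non-negative value, including 0, as positive
s_safe <- ifelse(x < 0, -1, 1)     # -1  1  1

print(rbind(s_base, s_safe))
```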

In addition, we can pick up Henrik's splendid idea to use the rowid() function to avoid grouping.

options(digits = 8) # required to print all significant digits of Longitude
library(data.table)
cols <- c("Latitude", "Longitude")
setDT(test)[, (cols) := lapply(.SD, \(x) x + fifelse(x < 0, -1, 1) * 
                                 (rowidv(.SD, cols) - 1) * 0.00001), .SDcols = cols]
test
   Latitude  Longitude
1: 45.14565 -105.66660
2: 45.14566 -105.66661
3: 45.14567 -105.66662
4: 45.14565 -104.33330
5: 33.22220 -104.33330
6: 31.22122 -105.77777
7: 31.22123 -105.77778
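
To see where the offsets come from, the occurrence counter can be inspected on its own. This is a sketch on a fresh copy of OP's sample data, and the name occurrence is made up for illustration. Note that rowidv() numbers repeats across the whole table, which coincides with "same as the pair above" here only because the duplicates in the sample data are adjacent.

```r
library(data.table)

# Fresh copy of OP's sample data (the code above modifies test by reference)
test <- data.table(
  Latitude  = c(45.14565, 45.14565, 45.14565, 45.14565, 33.2222,
                31.22122, 31.22122),
  Longitude = c(-105.6666, -105.6666, -105.6666, -104.3333,
                -104.3333, -105.77777, -105.77777)
)

# 1 for the first occurrence of a pair, 2 for the second, and so on
occurrence <- rowidv(test, cols = c("Latitude", "Longitude"))
print(occurrence)                    # 1 2 3 1 1 1 2

# Offset size per row before the sign is applied
print((occurrence - 1) * 0.00001)
```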
Uwe
0

As usual there is no need to use a loop:

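library(dplyr)
test_updated <- test %>% 
    mutate(
        # TRUE when both coordinates equal those of the previous row;
        # lag() yields NA for row 1, which coalesce() turns into FALSE
        dup = coalesce(Latitude == lag(Latitude) & Longitude == lag(Longitude), FALSE),
        # adds the increment once per repeated row (not cumulatively)
        across(c(Latitude, Longitude), 
            function(x) if_else(dup, x + 0.00001, x)
            )
        ) %>% 
    select(-dup)

format(round(test_updated, 5), nsmall = 5)
  Latitude  Longitude
1 45.14565 -105.66660
2 45.14566 -105.66659
3 45.14566 -105.66659
4 45.14565 -104.33330
5 33.22220 -104.33330
6 31.22122 -105.77777
7 31.22123 -105.77776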
user438383
-1

Not sure if I understand you correctly, but maybe something like this?

n <- nrow(test)
test_updated <- data.frame(Latitude = double(n),
                           Longitude = double(n))

add <- 0.00001
test_updated[1, ] <- test[1, ]  # the first row has no row above it
for (i in 2:n) {
  # compare the current pair with the pair in the row above
  if (test$Latitude[i - 1] == test$Latitude[i] && test$Longitude[i - 1] == test$Longitude[i]) {
    test_updated$Latitude[i] <- test$Latitude[i] + add
    test_updated$Longitude[i] <- test$Longitude[i] + add
  } else {
    test_updated[i, ] <- test[i, ]
  }
}
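
The loop above adds the increment once to each repeated row. If the increment should instead grow within a run of consecutive identical pairs, as in OP's expected result, one possible loop-free sketch in base R uses rle() and sequence(). The helper shift() and the names key, run_lengths and offset are made up for illustration, and moving negative values away from zero is an assumption read off OP's expected output.

```r
test <- data.frame(
  Latitude  = c(45.14565, 45.14565, 45.14565, 45.14565, 33.2222,
                31.22122, 31.22122),
  Longitude = c(-105.6666, -105.6666, -105.6666, -104.3333,
                -104.3333, -105.77777, -105.77777)
)
add <- 0.00001

# Label each row by its pair, then measure runs of identical neighbours
key <- paste(test$Latitude, test$Longitude)
run_lengths <- rle(key)$lengths

# 0 for the first row of a run, 1 for the second, 2 for the third, ...
offset <- sequence(run_lengths) - 1

# Assumption: shift away from zero, as in OP's expected output
shift <- function(x) x + ifelse(x < 0, -1, 1) * offset * add

test_updated <- data.frame(Latitude  = shift(test$Latitude),
                           Longitude = shift(test$Longitude))
print(format(round(test_updated, 5), nsmall = 5))
```

Unlike an occurrence counter over the whole table, rle() only looks at neighbouring rows, so a pair that reappears later in the data starts again at offset 0.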
BillyBouw