
I want to create a new column (df_one$acceslane) with binary values. It should be 1 if df_one$direction == df_two$direction and df_one$location and df_two$location are almost the same (see Distance in the nested for loop below), and 0 otherwise.

df_one:

direction | location | acceslane    
L | 12.3 | NA
R | 14.8 | NA

df_two:

direction | location
L | 12.5 
R | 145.0

for (i in 1:nrow(df_one)) {
  for (j in 1:nrow(df_two)) {
    # candidate locations within +/- 0.5 of df_two's location, in steps of 0.1
    Distance <- seq(df_two[j, 2] - .5, df_two[j, 2] + .5, by = .1)
    if ((df_one[i, 1] == df_two[j, 1]) & (df_one[i, 2] %in% Distance)) {
      df_one[i, 3] <- 1
      break
    } else {
      df_one[i, 3] <- 0
    }
  }
}

So this code works, but it's not very fast. How can I speed this up?

nrussell
Arnand
  • Do you want the Locations to be close, meaning within 0.5 bounds or do you want them to be equal at the first decimal level? Would (L,12.3) and (L,12.45) be considered as acceslane = 1? – ab90hi Jan 11 '17 at 14:58
  • @ab90hi They should match if they are within plus or minus 0.5 of each other. So if df_one$location = 12.30 and df_two$location = 12.79, they should match. If df_two$location were 12.81 instead of 12.79, they shouldn't match. – Arnand Jan 11 '17 at 15:30
  • @Armans I didn't get the rolling join at first, so my idea was not a good one. Cleaning up the comments. – Tensibai Jan 11 '17 at 15:30

1 Answer


Your example doesn't run for me, but I think you are looking to do a rolling join:

library(data.table)

df_one <- fread("direction | location     
             L | 12.3 
             L | 12.7 
             L | 13.1 
             R | 14.8 ", sep = "|")
df_two <- fread("direction | location
             L | 12.5 
             R | 145.0", sep = "|")

df_one[, acceslane := 0]
df_one[df_two, acceslane := 1, on = .(direction, location), roll = 0.5]
df_one[df_two, acceslane := 1, on = .(direction, location), roll = -0.5]
#   direction location acceslane
#1:         L     12.3         1
#2:         L     12.7         1
#3:         L     13.1         0
#4:         R     14.8         0
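
As far as I can tell, each rolling join flags only the nearest row of df_one on one side of a df_two location. If every row within ±0.5 should be flagged, a non-equi join is one possible alternative. The following is only a sketch and not part of the answer above: it assumes data.table 1.9.8 or later, the extra 12.4 row is made up for illustration, and the lower and upper columns are helper names introduced here.

```r
library(data.table)

# Example data; the 12.4 row is hypothetical and only shows that several
# rows close to the same df_two location can be flagged at once.
df_one <- data.table(direction = c("L", "L", "L", "L", "R"),
                     location  = c(12.3, 12.4, 12.7, 13.1, 14.8))
df_two <- data.table(direction = c("L", "R"),
                     location  = c(12.5, 145.0))

# Helper columns (names are assumptions) holding the tolerance window.
df_two[, `:=`(lower = location - 0.5, upper = location + 0.5)]

# Start with 0 everywhere, then set 1 for every df_one row whose location
# falls inside the window of a df_two row with the same direction.
df_one[, acceslane := 0L]
df_one[df_two, acceslane := 1L,
       on = .(direction, location >= lower, location <= upper)]

print(df_one)
```

The range comparison also avoids testing decimal numbers for exact equality.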

PS: Never rely on exact comparison of decimal (floating-point) numbers, or sooner or later you will be asking the FAQ about why two seemingly equal numbers are not equal.
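
To illustrate the PS, here is a small base R sketch of why a lookup like `%in% seq(...)` can silently miss a value, and how a tolerance check avoids the problem. The function name within_half and the tolerance 1e-8 are assumptions made for this example.

```r
# The fourth element of the grid is computed as 0 + 3 * 0.1, which is not
# the same double as the literal 0.3.
grid <- seq(0, 1, by = 0.1)
exact_hit <- 0.3 %in% grid                      # FALSE
tolerant_hit <- any(abs(grid - 0.3) < 1e-8)     # TRUE

# Hypothetical helper: compare the distance against the tolerance directly
# instead of looking the value up in a generated sequence.
within_half <- function(a, b, tol = 0.5) abs(a - b) <= tol

print(exact_hit)
print(tolerant_hit)
print(within_half(12.3, 12.5))
print(within_half(12.3, 145.0))
```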

Roland
  • It doesn't work on my own data. I think it's because I used dplyr, and the classes of my df_one and df_two are: "tbl_df" "tbl" "data.frame". Do you know how to convert these to "data.table" "data.frame"? – Arnand Jan 11 '17 at 15:56
  • as.data.table should do the job. – Roland Jan 11 '17 at 16:30
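
A minimal sketch of the conversion suggested in the last comment. To avoid depending on dplyr, the example fakes the class vector quoted in the comment on a plain data frame; with a real tibble the as.data.table() call should be the same. The object names df_one_tbl and df_one_dt are made up for illustration.

```r
library(data.table)

# Stand-in for a dplyr tibble: a plain data frame carrying the class vector
# quoted in the comment (an assumption made for this example only).
df_one_tbl <- structure(
  data.frame(direction = c("L", "R"), location = c(12.3, 14.8)),
  class = c("tbl_df", "tbl", "data.frame")
)

# as.data.table() returns a converted copy and leaves the input untouched.
df_one_dt <- as.data.table(df_one_tbl)

print(class(df_one_dt))
```

If copying a large table is a concern, data.table also provides setDT(), which converts a data frame by reference.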