Speeding up nested for loop with seq in r

Question

I want to create a new column (df_one$acceslane) with binary values. It should be 1 if df_one$direction == df_two$direction and df_one$location and df_two$location are almost the same (see Distance in the nested for loop below), and 0 otherwise.

df_one:

direction | location | acceslane    
L | 12.3 | NA
R | 14.8 | NA

df_two:

direction | location
L | 12.5 
R | 145.0

for (i in 1:nrow(df_one)) {
  for (j in 1:nrow(df_two)) {
    # Candidate locations within +/- 0.5 of df_two's location, in steps of 0.1
    Distance <- seq(df_two[j, 2] - .5, df_two[j, 2] + .5, by = .1)
    if ((df_one[i, 1] == df_two[j, 1]) && (df_one[i, 2] %in% Distance)) {
      df_one[i, 3] <- 1
      break
    } else {
      df_one[i, 3] <- 0
    }
  }
}

So this code works, but it's not very fast. How can I speed this up?

Do you want the locations to be close, meaning within 0.5 of each other, or do you want them to be equal at the first decimal place? Would (L, 12.3) and (L, 12.45) be considered as acceslane = 1? – ab90hi, Jan 11 '17 at 14:58
@ab90hi They should match if they are within plus or minus 0.5 of each other. So df_one$location = 12.30 and df_two$location = 12.79 should match. If df_two$location were 12.81 instead of 12.79, they shouldn't match. – Arnand, Jan 11 '17 at 15:30
@Armans I didn't understand the rolling join at first, so that idea was not a good one. Cleaning up the comments. – Tensibai, Jan 11 '17 at 15:30

score 5 · Accepted Answer · edited May 23 '17 at 12:24

Your example doesn't run for me, but I think you are looking to do a rolling join:

library(data.table)

df_one <- fread("direction | location     
             L | 12.3 
             L | 12.7 
             L | 13.1 
             R | 14.8 ", sep = "|")
df_two <- fread("direction | location
             L | 12.5 
             R | 145.0", sep = "|")

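# Default: no match
df_one[, acceslane := 0]
# Roll forward: flag the nearest df_one row at or below each df_two location, within 0.5
df_one[df_two, acceslane := 1, on = .(direction, location), roll = 0.5]
# Roll backward: flag the nearest df_one row at or above each df_two location, within 0.5
df_one[df_two, acceslane := 1, on = .(direction, location), roll = -0.5]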
#   direction location acceslane
#1:         L     12.3         1
#2:         L     12.7         1
#3:         L     13.1         0
#4:         R     14.8         0
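
As far as I understand rolling joins, each df_two row is matched to at most one df_one row per roll direction (the nearest one on that side). If several df_one rows can fall inside the same window of plus or minus 0.5, a non-equi join may be worth considering instead. The following is only a sketch and not part of the original answer: the names `tol`, `lower`, and `upper` are made up for illustration, and it assumes a data.table version that supports non-equi joins (1.9.8 or later).

```r
library(data.table)

# Same example data as above, built directly instead of via fread()
df_one <- data.table(direction = c("L", "L", "L", "R"),
                     location  = c(12.3, 12.7, 13.1, 14.8))
df_two <- data.table(direction = c("L", "R"),
                     location  = c(12.5, 145.0))

tol <- 0.5  # hypothetical name for the allowed distance

# Precompute the window around each df_two location (hypothetical helper columns)
df_two[, `:=`(lower = location - tol, upper = location + tol)]

# Default: no match
df_one[, acceslane := 0L]

# Flag every df_one row with the same direction whose location lies in [lower, upper]
df_one[df_two, acceslane := 1L,
       on = .(direction, location >= lower, location <= upper)]

print(df_one)
```

On this example data the flagged rows should be the same as with the two rolling joins; the difference would only show up when more than one df_one row sits on the same side of a df_two location within the window.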

PS: Never rely on exact comparison of decimal numbers, or sooner or later you will end up asking the classic FAQ about why two floating-point numbers that look equal do not compare as equal.
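
To illustrate the PS, here is a small base R sketch of why `%in%` against a `seq()` of decimals is fragile, and of a tolerance-based check as one possible replacement. The helper `is_close` and its `eps` argument are hypothetical names introduced here, not something from the question or the answer.

```r
# Exact comparison of doubles is fragile
0.1 + 0.2 == 0.3                     # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE, all.equal() compares with a tolerance
0.3 %in% seq(0, 1, by = 0.1)         # FALSE, the sequence holds 3 * 0.1, not 0.3

# Hypothetical helper: are a and b within tol of each other?
# eps adds a little slack so values sitting right on the boundary are not
# rejected because of rounding error.
is_close <- function(a, b, tol = 0.5, eps = 1e-9) {
  abs(a - b) <= tol + eps
}

is_close(12.30, 12.79)  # TRUE
is_close(12.30, 12.81)  # FALSE
```

The two calls at the end use the values from the asker's comment: 12.30 and 12.79 should match, 12.30 and 12.81 should not.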

answered Jan 11 '17 at 15:01

Roland


It doesn't work on my own data. I think it's because I used dplyr, so the class of df_one and df_two is "tbl_df" "tbl" "data.frame". Do you know how to convert this to "data.table" "data.frame"? – Arnand Jan 11 '17 at 15:56
as.data.table should do the job. – Roland Jan 11 '17 at 16:30
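
A minimal sketch of that conversion. To avoid depending on dplyr it starts from a plain data.frame; the assumption is that a tibble converts the same way, since its class vector also includes "data.frame". `dt_one` is a name made up for this example.

```r
library(data.table)

df_one <- data.frame(direction = c("L", "R"),
                     location  = c(12.3, 14.8))

# as.data.table() returns a converted copy and leaves df_one unchanged
dt_one <- as.data.table(df_one)
class(dt_one)   # "data.table" "data.frame"

# setDT() converts the existing object in place (by reference), without a copy
setDT(df_one)
class(df_one)   # "data.table" "data.frame"
```

Which of the two to use is mostly a question of whether the original object should be kept as it is.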
