I've been reading about how to improve R code, looking at some of the answers here and also reading a bit of The R Inferno. Now I have this problem, and the loop I wrote seems to be taking forever (15 hours and counting):
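# Number of distinct end and start stations
k <- NROW(unique(df$EndStation.Id))
l <- NROW(unique(df$StartStation.Id))
# Columns: 1 = Duration, 2 = StartStation.Id, 3 = EndStation.Id
m1 <- as.matrix(df[, c("Duration", "StartStation.Id", "EndStation.Id")])
g <- function(m) {
  # Note: looping over 1:l and 1:k assumes the station ids are exactly 1..l and 1..k
  for (i in 1:l) {
    for (j in 1:k) {
      # Durations of all trips from station i to station j
      duration <- m[(m[, 2] == i & m[, 3] == j), 1]
      # A pair with a single trip cannot be scaled meaningfully: mark it NA
      if (NROW(duration) <= 1) {
        m[(m[, 2] == i & m[, 3] == j), 1] <- NA
        next
      }
      # Scale each duration by the median duration of its (start, end) pair
      duration <- duration / median(duration)
      m[(m[, 2] == i & m[, 3] == j), 1] <- duration
    }
  }
  return(m)
}
answer <- g(m1)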
There are 750 start stations and 750 end stations, and the duration vector for each pair can vary a lot in size, from 1 or 2 to 80. Can this loop be improved, or should I give up and try to get access to a faster computer?
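For scale: 750 x 750 is 562,500 iterations, and each one scans the whole matrix up to three times with the logical comparisons. A grouped version might avoid that. The following is only a sketch of the idea using base R's `ave()`, run on a small made-up data frame; the toy values, the helper name `scale_by_median`, and the new column `ScaledDuration` are assumptions for illustration and not part of my real data.

```r
# Toy data standing in for the real df (hypothetical values)
df <- data.frame(
  Duration        = c(10, 20, 30, 5, 7, 100),
  StartStation.Id = c(1, 1, 1, 2, 2, 3),
  EndStation.Id   = c(2, 2, 2, 1, 1, 1)
)

# Hypothetical helper: scale one group's durations by the group median;
# groups with at most one trip become NA, as in the loop above
scale_by_median <- function(d) {
  if (length(d) <= 1) return(rep(NA_real_, length(d)))
  d / median(d)
}

# ave() applies FUN within each (start, end) combination and returns
# the results in the original row order
df$ScaledDuration <- ave(
  df$Duration,
  df$StartStation.Id, df$EndStation.Id,
  FUN = scale_by_median
)

print(df)
```

Unlike the loop, this does not require the station ids to be exactly 1 to 750, since the groups are taken from the values that actually occur in the data.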
Best regards, Fernando