I have 1,000,000 observations and I need to loop over my dataset (T1) with some conditions to create a new variable (test). Because the dataset is very large, a plain for loop takes a very long time to run, so I am trying to use foreach
to reduce the execution time.
Here is what I am trying to do, but it does not work well. Any suggestions, please?
This is an example of my input:
T1 <- read.table(text="
ID CodeActe Cout test
1 1 356 34.00 NA
2 1 357 8.00 NA
3 1 363 5.75 NA
4 1 9411 150.00 NA
5 2 9411 150.00 NA
6 2 363 5.75 NA", header=T)
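For reference, here is a plain for-loop sketch of the condition I am applying (same logic as the foreach attempt below); it gives the result I want but is far too slow on the full data:

# Slow version: fill test row by row, comparing each row's ID with the next one
for (i in seq_len(nrow(T1) - 1)) {
  if (T1$ID[i] == T1$ID[i + 1]) {
    if (T1$CodeActe[i] == 356) {
      T1$test[i] <- 1
    } else if (T1$CodeActe[i] %in% c(357, 363)) {
      T1$test[i] <- 0
    } else {
      T1$test[i] <- T1$CodeActe[i]
    }
  }
}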
And here is my foreach attempt:
res <- foreach::foreach(i = 1:nrow(T1), .combine = rbind) %dopar% {
  if (i + 1 > nrow(T1)) {
    break
  }
  if (T1$ID[i] == T1$ID[i + 1]) {
    if (T1$CodeActe[i] == 356) {
      T1$test[i] <- 1
    } else if (T1$CodeActe[i] == 357) {
      T1$test[i] <- 0
    } else if (T1$CodeActe[i] == 363) {
      T1$test[i] <- 0
    } else {
      T1$test[i] <- T1$CodeActe[i]
    }
  }
}