I have 1,000,000 observations and need to run a for loop with some conditions to create a new variable (test) in my dataset (T1). Because the dataset is very large, execution takes a very long time, so I am trying to use foreach to speed it up. Here is what I'm trying to do, but it does not work well. Any suggestions, please?

This is an example of my input:

T1 <- read.table(text="
ID CodeActe   Cout test
1  1      356  34.00   NA
2  1      357   8.00   NA
3  1      363   5.75   NA
4  1     9411 150.00   NA
5  2     9411 150.00   NA
6  2      363   5.75   NA", header = TRUE)

And my code:

res <- foreach::foreach(i=1:nrow(T1), .combine = rbind) %dopar% {
  if (i+1 > nrow(T1)){
    break
  }
  if (T1$ID[i]==T1$ID[i+1]){
    if (T1$CodeActe[i]==356){
      T1$test[i]<-1
    }
    else if (T1$CodeActe[i]==357){
      T1$test[i]<-0
    }
    else if (T1$CodeActe[i]==363){
      T1$test[i]<-0
    }
    else{
      T1$test[i]<-T1$CodeActe[i]
    }
  }
}
MrFlick
Ahmed
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Oct 08 '19 at 15:57
  • Hi, thank you for your response. Please see my input below – Ahmed Oct 08 '19 at 17:03
  • So what's the desired output for this input? What happens when `T1$ID[i] != T1$ID[i+1]`? Do you want to keep the NA? – MrFlick Oct 08 '19 at 18:01
  • When `T1$ID[i] != T1$ID[i+1]`, apply this code: `if (T1$ID[i]!=T1$ID[i+1]){ if (T1$CodeActe[i]==363){ T1$test[i]<-0 } else{ T1$test[i]<-T1$CodeActe[i] } }` – Ahmed Oct 08 '19 at 18:07
  • You should have a look at `dplyr` (see https://dplyr.tidyverse.org/articles/dplyr.html). In particular, functions `group_by(...)`, `mutate(...)` and `case_when(...)` will be helpful. It is not parallelized but in my experience, this is not a problem for simple operations up to at least 100M records, running on a laptop. – Pierre Gramme Oct 09 '19 at 10:09
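
The rules in the question's loop and in Ahmed's follow-up comment can also be written without a loop at all. The following is only a sketch in base R (not the dplyr functions named in the comment above, and not a repaired foreach call), under two assumptions: the rows are already in the order the loop would visit them, and the last row keeps `NA`, because the original loop stops before reaching it. The helper names `same_next` and `has_next` are made up for this illustration.

```r
# Sample input from the question (the first, unnamed column becomes row names).
T1 <- read.table(text = "
ID CodeActe   Cout test
1  1      356  34.00   NA
2  1      357   8.00   NA
3  1      363   5.75   NA
4  1     9411 150.00   NA
5  2     9411 150.00   NA
6  2      363   5.75   NA", header = TRUE)

n <- nrow(T1)

# has_next: FALSE only for the last row, which has no successor to compare with.
has_next <- seq_len(n) < n
# same_next: TRUE when row i and row i + 1 share the same ID (hypothetical helper).
same_next <- c(T1$ID[-n] == T1$ID[-1], FALSE)

code <- T1$CodeActe
test <- as.numeric(code)             # default in both branches: keep CodeActe
test[code == 363] <- 0               # 363 maps to 0 whether or not the IDs match
test[same_next & code == 356] <- 1   # 356 maps to 1 only when the next ID matches
test[same_next & code == 357] <- 0   # 357 maps to 0 only when the next ID matches
test[!has_next] <- NA                # assumption: last row stays NA, as in the loop

T1$test <- test
print(T1)
```

Each assignment works on the whole column at once, so no row-by-row iteration or parallel backend is involved. If the last row should instead follow the "IDs differ" rule, dropping the `test[!has_next] <- NA` line would do that.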

0 Answers