
I am currently changing from Stata to R and would appreciate some help with the following problem.

I am analyzing treatment durations (in months) and am trying to generate a dummy variable for each month, where 0 = no treatment and 1 = treatment.

I have variables with the total number of months in treatment for each treatment episode (dur_t) and variables for the time between treatment episodes (dur_nt). Here is some random data that looks roughly like what I have (I can't share mine).

library(dplyr)  # provides na_if()

set.seed(10000)
id <- 1:100
dur_t1 <- round(runif(n = 100, min = 1, max = 12), 0)
dur_nt1 <- round(runif(n = 100, min = 1, max = 12), 0)
dur_t2 <- round(runif(n = 100, min = 1, max = 12), 0)
df <- data.frame(id, dur_t1, dur_nt1, dur_t2)
# Set some values to NA to mimic cases without a second episode
df$dur_nt1 <- na_if(df$dur_nt1, 7)
df$dur_nt1 <- na_if(df$dur_nt1, 3)
df$dur_t2 <- na_if(df$dur_t2, 11)
df$dur_t2 <- na_if(df$dur_t2, 5)
# No gap recorded means there is no second treatment episode either
df$dur_t2[is.na(df$dur_nt1)] <- NA

So my data looks something like this:

id dur_t1 dur_nt1 dur_t2
1 1 0 5
2 3 3 2
3 1 NA NA
4 2 2 2
5 5 2 1

And I would like to have something like this:

id dur_t1 dur_nt1 dur_t2 month1 month2 month3 month4 month5 month6 month7 month8
1 1 0 5 1 1 1 1 1 1 NA NA
2 3 3 2 1 1 1 0 0 0 1 1
3 1 NA NA 1 NA NA NA NA NA NA NA
4 2 2 2 1 1 0 0 1 1 NA NA
5 5 2 1 1 1 1 1 1 0 0 1

As you can see in the table:

  • First row: the individual finished the first treatment episode within 1 month (dur_t1=1), so month1=1. They started a new treatment during the same month, so the no-treatment duration (dur_nt1) equals 0 and no month is coded 0. The second treatment lasted 5 months (dur_t2=5), so month2-month6 are each coded 1. Finally, month7 onward should be NA for that case.
  • Second row: dur_t1=3, so month1-month3 are coded 1. The no-treatment variable (dur_nt1) equals 3, so month4-month6 are coded 0. Lastly, dur_t2=2, so month7-month8 are coded 1.
  • Third row: there was only 1 treatment episode, lasting 1 month (dur_t1=1), so month1=1 and every later month is NA.

And so on. I have 85000 observations in my dataset.
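
The coding rule described above can be sketched in base R. This is only an illustration under stated assumptions: the small `df` is retyped from the example table, `month_codes` and `n_months` are hypothetical names, and a missing duration is treated as zero months so that every month after the last recorded episode comes out as NA.

```r
# Small example matching the table in the question
df <- data.frame(
  id      = 1:5,
  dur_t1  = c(1, 3, 1, 2, 5),
  dur_nt1 = c(0, 3, NA, 2, 2),
  dur_t2  = c(5, 2, NA, 2, 1)
)

# Build one row of month codes: 1 for treatment months, 0 for gap months.
# Assumption: a missing duration counts as zero months, so rep() never
# receives an NA in its 'times' argument.
month_codes <- function(t1, nt1, t2, n_months) {
  times <- c(t1, nt1, t2)
  times[is.na(times)] <- 0
  codes <- rep(c(1, 0, 1), times = times)
  length(codes) <- n_months  # pads the tail with NA up to n_months
  codes
}

# The longest row decides how many month columns are needed
n_months <- max(rowSums(df[c("dur_t1", "dur_nt1", "dur_t2")], na.rm = TRUE))

# One vector per row, then stack the vectors into a matrix
rows <- mapply(month_codes, df$dur_t1, df$dur_nt1, df$dur_t2,
               MoreArgs = list(n_months = n_months), SIMPLIFY = FALSE)
months <- do.call(rbind, rows)
colnames(months) <- paste0("month", seq_len(n_months))

result <- cbind(df, months)
result
```

Because each row is built independently, the same approach should extend to more episodes by lengthening the `c(1, 0, 1)` pattern and the vector of durations, though this has not been tested on the full 85000-row dataset.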

Thanks in advance!!

idborquez
  • It's easier to help if you share data in a [reproducible format](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Please [do not post code or data in images](https://meta.stackoverflow.com/q/285551/2372064) otherwise we have to retype everything just to test possible solutions. But something like `rep` could help. This would generate the data for the first row: `rep(c(1,0,1), c(1,5,6))` – MrFlick Oct 28 '22 at 18:00
  • Hi, thanks for the reply and sorry about not posting any code, here is some you can use that looks more like my data `set.seed(10000) id <- 1:100 dur_t1 <- round(runif(n = 100, min = 1, max = 12),0) dur_nt1 <- round(runif(n = 100, min = 1, max = 12),0) dur_t2 <- round(runif(n = 100, min = 1, max = 12),0) df <- data.frame(id,dur_t1,dur_nt1,dur_t2)` – idborquez Oct 28 '22 at 18:59
  • Well, that random data won't always add up to 12 months so it's not clear what the output should look like in that case. Make sure to edit your question with code rather than post code in comments so it can be properly formatted. – MrFlick Oct 28 '22 at 19:23
  • If you are using `dplyr+tidyr`, then this should work: `df %>% rowwise() %>% mutate(months=list(rep(c(1,0,1), c(dur_t1,dur_nt1, dur_t2)))) %>% unnest_wider(months, names_sep="")` – MrFlick Oct 28 '22 at 19:35
  • Thanks! I just edited the post and will be trying this solution. New to the community, hope the new information provided helps. – idborquez Oct 28 '22 at 19:57
  • Hi, the code didn't work. I am getting this error: Error in `mutate()`: ! Problem while computing `months = list(rep(c(1, 0, 1), c(dur_t1, dur_nt1, dur_t2)))`. i The error occurred in row 1. Caused by error in `rep()`: ! invalid 'times' argument Backtrace: 1. ... %>% unnest_wider(months, names_sep = "") 5. dplyr:::mutate.data.frame(., months = list(rep(c(1, 0, 1), c(dur_t1, dur_nt1, dur_t2)))) 6. dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env()) 8. mask$eval_all_mutate(quo). Thanks for the help – idborquez Oct 28 '22 at 22:45
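
One likely cause of the `invalid 'times' argument` error reported above is an NA among the durations, since `rep()` does not accept a missing count. The base R sketch below reproduces that situation for a single hypothetical row (`durations`, `msg` and `codes` are illustrative names) and shows one possible workaround, which assumes a missing duration can be treated as zero months.

```r
# A row with only one treatment episode: the later durations are missing
durations <- c(dur_t1 = 1, dur_nt1 = NA, dur_t2 = NA)

# rep() rejects a missing count; capture the message instead of stopping
msg <- tryCatch(
  rep(c(1, 0, 1), times = durations),
  error = function(e) conditionMessage(e)
)
msg

# Assumption: treat missing durations as zero months before calling rep()
durations[is.na(durations)] <- 0
codes <- rep(c(1, 0, 1), times = durations)
codes
```

The same replacement could be applied inside the `mutate()` call from the earlier comment before the durations are passed to `rep()`.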
