I am trying to create a new variable called "txtype" (treatment type) based on a variable containing codes for different treatments "NDC". In this "txtype" variable, I want to create multiple levels indicating the actual treatment type.

So far, I only know how to create a higher level variable for the actual treatment type:

 data$typeA <- with(data, NDC %in% c("11111", "22222", "33333"))

What I want is a single variable, data$txtype, with LEVELS typeA through typeG. For example, level typeA covers NDC 11111, 22222, or 33333; typeB covers NDC 44444 or 55555; and so on, for 7 types in total.

I apologize in advance for this basic question. If something similar has already been posted, I would appreciate it if you could point me in the right direction!

Edit: I am so sorry this edit is late. case_when was elegant but did not do what I was looking for! I am trying to create ONE variable called "txtype" with multiple LEVELS named "typeA", "typeB", etc. Below are two columns from sample data including subject ID and variable "NDC". The third column is what I am hoping to create, based on NDC values.

ID     NDC     txtype
1      11111   typeA
1      44444   typeB
2      22222   typeA
2      33333   typeA
2      55555   typeC
divibisan
sh2
  • I think I disagree with @CalumYou, but your question needs more details. Please make this question reproducible by adding sample data. (I suspect this can be solved by using a lookup `data.frame` and then using `merge()`, but that waits to be seen.) – r2evans Apr 04 '18 at 19:25
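
The lookup-table idea in the comment above could look roughly like the sketch below. It is only an illustration: the `lookup` data frame and the `merged` name are hypothetical, the sample rows are copied from the question, and the code-to-type mapping follows the question's prose (note that the sample table lists 55555 as typeC, so adjust the mapping to whichever is correct).

```r
# Sample data copied from the question (ID and NDC columns only).
data <- data.frame(
  ID  = c(1, 1, 2, 2, 2),
  NDC = c("11111", "44444", "22222", "33333", "55555"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per NDC code.
# Extend it with the remaining codes, up to typeG.
lookup <- data.frame(
  NDC    = c("11111", "22222", "33333", "44444", "55555"),
  txtype = c("typeA", "typeA", "typeA", "typeB", "typeB"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of `data`; codes missing from `lookup` get NA.
# merge() sorts the result by the key column by default, so row order changes.
merged <- merge(data, lookup, by = "NDC", all.x = TRUE)
```

One reason to prefer this over a long chain of conditions is that the mapping lives in a data frame, so it can be read from a file and updated without touching the code.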

1 Answer

Look into `dplyr::case_when`. It lets you write a vectorised if/else chain, so this would be something like:

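library(dplyr)

# Assign the result back; mutate() does not modify `data` in place.
data <- data %>%
    mutate(
        txtype = case_when(
            NDC %in% c("11111", "22222", "33333") ~ "typeA",
            NDC %in% c("44444", "55555") ~ "typeB"
            # ...one line per remaining type; unmatched NDC values become NA
        )
    )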
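
Since the question asks for a single variable with LEVELS, it may help to know that `case_when` returns a character vector here, not a factor. A minimal sketch of one way to get a factor with all seven levels is below. The sample rows are copied from the question, `type_levels` is a name made up for this example, and only the two types given in the question are mapped.

```r
library(dplyr)

# Sample data copied from the question (ID and NDC columns only).
data <- data.frame(
  ID  = c(1, 1, 2, 2, 2),
  NDC = c("11111", "44444", "22222", "33333", "55555"),
  stringsAsFactors = FALSE
)

# All seven level names, "typeA" through "typeG".
type_levels <- paste0("type", LETTERS[1:7])

data <- data %>%
  mutate(
    # Step 1: character labels; NDC values with no match become NA.
    txtype = case_when(
      NDC %in% c("11111", "22222", "33333") ~ "typeA",
      NDC %in% c("44444", "55555") ~ "typeB"
    ),
    # Step 2: convert to a factor so every level exists even if unused.
    txtype = factor(txtype, levels = type_levels)
  )

levels(data$txtype)
```

Declaring the levels explicitly keeps typeC through typeG available in tables and models even when no row in the current data uses them.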
Calum You