I'm working with a data frame "mydata" containing a variable "therapy" which is a factor (0/1). Another variable is "died" (1 if died, 0 if survived). There are no missing values, so every observation has a value for therapy and died.

Now I would like to alter the value of "therapy" based on the value of "died": if died == 1, therapy should be set to 0 (so I want to replace the existing value); otherwise the value should stay unchanged.

mydata$therapy <- ifelse(mydata$died == 1,
0,
mydata$therapy)

As a result, I get values that include not only 0 and 1 but also 2 (therapy never contained a 2). I assume that the increment by one is due to the factor type of "therapy". The following code with case_when leads to the same result:

mydata <- mydata %>%
  mutate(therapy = case_when(
  died == 1 ~ 0,
  TRUE ~ therapy))

Does anybody have an idea what I am doing wrong? Or does anybody have a solution for just changing "therapy" to zero if died == 1 and keeping all values as they are if died == 0?
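
For reference, the symptom can be reproduced with a small made-up data frame. This is only a sketch: the values below are invented for illustration and are not the real "mydata". It relies on the fact that base R's ifelse does not keep the factor attributes of its arguments, so the factor's internal integer codes (1 for level "0", 2 for level "1") end up in the result.

```r
# Made-up example data (an assumption, not the original data set)
mydata <- data.frame(
  therapy = factor(c(0, 1, 1, 0, 1)),  # factor with levels "0" and "1"
  died    = c(0, 0, 1, 1, 0)
)

# The factor is stored as integer codes: level "0" -> 1, level "1" -> 2
codes <- as.integer(mydata$therapy)

# ifelse() returns a plain numeric vector; for rows with died == 0 it
# picks up the internal codes of therapy, not the labels
result <- ifelse(mydata$died == 1, 0, mydata$therapy)

print(codes)
print(result)
```

The result contains 0 for the rows where died == 1 and the codes 1 or 2 everywhere else, which matches the unexpected 2s described above.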

Vinícius Félix
MDStat
  • While it's very difficult to know without seeing any data, your `ifelse` code *cannot create `2` spontaneously*, so it's likely something else is wrong that you are attributing to that line of code. I would verify `levels(mydata$therapy)` before and then after that line of code to confirm that it doesn't change (or that it does?). Please make this question more reproducible by adding sample data (`dput(.)`) that includes enough variety that we can witness the phenomenon and suggest changes (we do not need all columns/rows). (See https://stackoverflow.com/q/5963269, thanks.) – r2evans Oct 07 '21 at 14:39
  • Also run unique(mydata$therapy) to verify that therapy "never" contained a 2. Based on your code, you overwrite with the content of your mydata$therapy "correctly". Thus, seeing a 2 means that there are 2's :) or your code has some other magic manipulation elsewhere. – Ray Oct 07 '21 at 14:46
  • P.S. having said this, please note that factors are "stored internally as numbers". Check out the following: `x = c(0,1); x = factor(x); as.numeric(x)`. This will return `[1] 1 2`, and thus "create" a 2 for a coerced factor value at the 2nd position. In this case, your 1. Check if you do something like this with your code. – Ray Oct 07 '21 at 14:53

1 Answer

Thank you all for your answers! Especially the comment by Ray was helpful - my problem was solved this way:

# Index the numeric level labels by the factor to recover 0/1 instead of the codes 1/2
mydata$therapy <- ifelse(mydata$died == 1,
                         0,
                         as.numeric(levels(mydata$therapy))[mydata$therapy])

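In the end I got the original 0/1 values instead of the 1/2 values, which are the integer codes the factor uses internally. Note that "therapy" is a numeric variable after this step, no longer a factor.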
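
If "therapy" should remain a factor, a possible variant is to assign the level label directly to the affected rows instead of going through ifelse. The following is a minimal sketch on made-up data (the data frame and the helper names "fixed" and "num" are assumptions for illustration); it also shows the as.character route to a numeric 0/1 variable.

```r
# Made-up example data (an assumption, not the original data set)
mydata <- data.frame(
  therapy = factor(c(0, 1, 1, 0, 1)),
  died    = c(0, 0, 1, 1, 0)
)

# Variant 1: overwrite by level label; the result is still a factor
fixed <- mydata$therapy
fixed[mydata$died == 1] <- "0"

# Variant 2: convert the labels to numbers first, then use ifelse()
num <- as.numeric(as.character(mydata$therapy))
num <- ifelse(mydata$died == 1, 0, num)

print(fixed)
print(num)
```

Both variants work from the level labels rather than the internal codes, so no 2 can appear.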

MDStat