I am working with a large data.frame of health conditions and outcomes, and I wish to combine 10 health conditions into a single condition. If the patient has a, or b, or c, or d, etc., then the patient would have condition one. I am trying to code it like this:

      dataset$one <-  ifelse(dataset, (dataset$a == 1)|
                            (dataset$b == 1)|
                            (dataset$c  == 1)|
                            (dataset$d  == 1),  1, 0)

This seems to work for the first condition, but not when I add conditions. Perhaps R does not allow multiple or statements? Any suggestions?

2 Answers

Assuming that dataset is a data frame, define the column names, cols, and then apply any across each row of dataset[cols] == 1 like this. Add zero to convert the result from logical to numeric:

cols <- c("a", "b", "c", "d")
dataset$one <- apply(dataset[cols] == 1, 1, any) + 0
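
As a quick check, here is a minimal sketch of the same idea on a small made-up data frame. The values are hypothetical and only illustrate the shape of the data, and the unname() call is an optional addition that drops the row names apply carries over:

```r
# Hypothetical toy data: four 0/1 condition columns, one row per patient
dataset <- data.frame(
  a = c(1, 0, 0, 0),
  b = c(0, 0, 1, 0),
  c = c(0, 0, 0, 0),
  d = c(0, 0, 1, 0)
)

cols <- c("a", "b", "c", "d")

# dataset[cols] == 1 is a logical matrix with one row per patient
flags <- dataset[cols] == 1

# any() collapses each row to TRUE/FALSE; adding 0 converts to numeric
dataset$one <- unname(apply(flags, 1, any)) + 0

print(dataset)
```

Rows 1 and 3 contain at least one 1, so one is 1 there and 0 in the other rows.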

Notes

  1. If the columns have NA values that you wish to exclude then add the na.rm = TRUE argument:

    dataset$one <- apply(dataset[cols] == 1, 1, any, na.rm = TRUE) + 0
    
  2. The Rfast package has rowAny which could be used if you don't need na.rm:

    library(Rfast)
    dataset$one <- rowAny(dataset[cols] == 1) + 0
    
G. Grothendieck

We can use Reduce with |

dataset$one <- as.integer(Reduce(`|`, lapply(dataset[c('a', 'b', 'c', 'd')], `==`, 1)))

Or another option is rowSums

dataset$one <- as.integer(rowSums(dataset[c('a', 'b', 'c', 'd')] == 1) > 0)
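
The two options can differ when the columns contain NA, so it may be worth checking them on a small example first. The sketch below uses made-up data, and the variable names by_reduce, by_rowsums and by_rowsums_narm are only for illustration:

```r
# Hypothetical toy data; rows 3 and 4 each have a missing value in column a
dataset <- data.frame(
  a = c(1, 0, NA, NA),
  b = c(0, 0, 1, 0),
  c = c(0, 0, 0, 0),
  d = c(0, 0, 0, 0)
)
cols <- c("a", "b", "c", "d")

# Reduce folds the per-column comparisons into
# a == 1 | b == 1 | c == 1 | d == 1, and NA | TRUE is TRUE
by_reduce <- as.integer(Reduce(`|`, lapply(dataset[cols], `==`, 1)))

# rowSums returns NA for any row that contains an NA
by_rowsums <- as.integer(rowSums(dataset[cols] == 1) > 0)

# Dropping NA from the count treats a missing value as "condition absent"
by_rowsums_narm <- as.integer(rowSums(dataset[cols] == 1, na.rm = TRUE) > 0)

print(data.frame(by_reduce, by_rowsums, by_rowsums_narm))
```

Here Reduce gives 1, 0, 1, NA, plain rowSums gives 1, 0, NA, NA, and rowSums with na.rm = TRUE gives 1, 0, 1, 0. Which of these is right depends on how missing values should be interpreted in the data.
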
akrun