I need to subset a data.table based on a condition across multiple columns and then perform an operation on the result.
A minimal example with a simple data.table:
x <- data.table(id=c(1, 2, 3, 4), colour1 = c('red', 'green', 'green', 'blue'),
colour2 = c('yellow', 'red', 'blue', 'black'),
colour3 = c('blue', 'black', 'red', 'yellow'),
score = c(0.7, 0.9, 0.2, 0.35))
I then want to find the maximum score for any row that contains the colour 'yellow':
max_score <- max(x[colour1 == 'yellow' | colour2 == 'yellow' | colour3 == 'yellow']$score)
In this example, the above code works fine but is there a better way of checking this condition across multiple columns? In practice, the table will be much larger and the number of columns can change so I'd like to do this dynamically.