I have a data set with many rows and the following columns: an id
column, a set of columns showing the result of one round of measurements for several values (val1.x
, val2.x
, val3.x
, ...) and another set of columns showing the result of another round of measurements for the same values (val1.y
, val2.y
, val3.y
, ...). Here is a reduced working example:
d <- data.table(
id = 1:10,
val1.x = c(1, 0, 0, 1, 0, 1, 0, 0, 1, 0),
val2.x = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 0),
val1.y = c(0, 0, 0, 1, 0, NA, NA, 0, 1, 0),
val2.y = c(1, 0, 0, NA, 0, 1, 0, 0, 1, 0)
)
My goal is to get to a data set which lists the same columns, along with the maximum of the two measurements for each value. Here is the desired output for above example
id val1.x val2.x val1.y val2.y val1.max val2.max
1: 1 1 1 0 1 1 1
2: 2 0 0 0 0 0 0
3: 3 0 1 0 0 0 1
4: 4 1 1 1 NA 1 1
5: 5 0 0 0 0 0 0
6: 6 1 0 NA 1 1 1
7: 7 0 0 NA 0 0 0
8: 8 0 0 0 0 0 0
9: 9 1 0 1 1 1 1
10: 10 0 0 0 0 0 0
As evident from the example, by max I mean max(..., na.rm = T)
. I also have a variable cols
already prepare with this value:
cols <- c('val1', 'val2')
The Goal
I want to use this variable to loop through the columns dynamically and calculate the max.
What is a good dplyr
way of achieving this?
What is a good data.table
way of achieving this?
NB: I do not want to use the order of columns (so solutions that reference columns by their order (e.g. 2:3
) are not desirable. The input may change and additional columns may be added to the left of the values, so I need to use the names of the columns to do the calculations. The id
column will always be unique per row.
What I have tried so far
I can get the right-hand side of the equation to be dynamic using as.symbol
like this:
d[, .(val1.max := pmax(eval(as.symbol('val1.x')), eval(as.symbol('val2.x'))))]
But I cannot get the left-hand side to become dynamic.
I also tried to implement a solution based on this SO question but it gives me an error:
left <- "va1.x"
right <- "va1.y"
new <- "val1.max"
expr <- bquote(.(as.name(new)):=pmax(as.name(left), as.name(right), na.rm=T))
d[, eval(expr)]