I have a dataset with dimensions 900,000 x 500, but the toy dataset below is sufficient for the question.
library(data.table)
df1 <- data.table(x = c(1,2,4,0), y = c(0,0,10,15), z = c(1,1,1,0))
I would like to do the following:
- For columns y and z
- select the rows where the value is 0
- replace those values with max + 1, where max is computed over the entire column
I am very new to data.table. Looking through questions here on Stack Overflow, I couldn't find a similar one, except this: How to replace NA values in a table *for selected columns*? data.frame, data.table
My own attempt is as follows, but this does not work:
for (col in c("x", "y")) df1[(get(col)) == 0, (col) := max(col) + 1)
Obviously, I haven't gotten accustomed to data.table, so I'm banging my head against the wall at the moment...
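For reference, two things in the attempt above stand out: the final `)` should be `]`, and `max(col)` takes the maximum of the column name (a string) rather than of the column itself. It also loops over x and y instead of y and z. A minimal sketch of one possible data.table approach, assuming the column maximum should be taken before any replacement and using `set()` to update by reference (the name `col_max` is just for illustration):

```r
library(data.table)

df1 <- data.table(x = c(1, 2, 4, 0), y = c(0, 0, 10, 15), z = c(1, 1, 1, 0))

for (col in c("y", "z")) {
  # Maximum over the entire column, computed before anything is replaced
  col_max <- max(df1[[col]])
  # Overwrite only the rows where the value is 0, modifying df1 in place
  set(df1, i = which(df1[[col]] == 0), j = col, value = col_max + 1)
}

df1
# y becomes 16, 16, 10, 15 and z becomes 1, 1, 1, 2; x is unchanged
```

Computing the maximum from `df1[[col]]` rather than inside `j` matters here, because with a row filter in `i` an expression in `j` would only see the filtered rows, where every value is 0.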
If anybody could provide a dplyr solution in addition to data.table, I would be thankful.
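On the dplyr side, a minimal sketch of what such a solution might look like, assuming dplyr 1.0 or later (for `across()`) and using a plain data.frame copy of the toy data; the name `result` is only for illustration:

```r
library(dplyr)

df1 <- data.frame(x = c(1, 2, 4, 0), y = c(0, 0, 10, 15), z = c(1, 1, 1, 0))

result <- df1 %>%
  # For y and z only: where the value is 0, substitute the column max + 1
  mutate(across(c(y, z), ~ ifelse(.x == 0, max(.x) + 1, .x)))

result
# y becomes 16, 16, 10, 15 and z becomes 1, 1, 1, 2; x is unchanged
```

Unlike the `set()` approach in data.table, this returns a modified copy rather than updating the table in place, which may matter for a 900,000 x 500 dataset.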