I have a dataset with 20 variables and 250K rows. I would like to add a new variable, "NumAdms", holding the number of "OT_entry" rows for each individual "patient_id". I have constructed a dummy example:
library(dplyr)
reproeg <- names(c("patient_id", "OT_entry", "Other_1", "Other_2",
"Other_3"))
reproeg$patient_id <- c(123, 123, 453, 289, 123)
reproeg$OT_entry <- c("01/01/2012 09:30:00", "20/01/2012 08:20:00",
"02/01/2012 09:40:00", "10/01/2012 11:00:00",
"10/02/2012 09:40:00")
reproeg$Other_1 <- c("xy", "xy", "xy", "zh", "xy")
reproeg$Other_2 <- c(22.3, 33.1, 22.1, 33.5, 44.2)
reproeg$Other_3 <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
reproeg %>%
  group_by(patient_id) %>%
  mutate(NumAdms, length(OT_entry))
I get the following error message:
Error in UseMethod("group_by_") :
no applicable method for 'group_by_' applied to an object of class "list"
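A likely reading of the error, offered as a sketch rather than a definitive diagnosis: `names()` applied to an unnamed character vector returns `NULL`, and assigning columns with `$<-` onto `NULL` builds a plain list, which `group_by()` has no method for. The version below assumes the dummy data was meant to be a data frame and that "NumAdms" should be the per-patient row count. The `data.frame()` construction, the use of `n()` in place of `length(OT_entry)`, and the `result` name are assumptions of this sketch, not part of the original code.

```r
library(dplyr)

# Assumption: the dummy example is meant to be a data frame, not a list.
reproeg <- data.frame(
  patient_id = c(123, 123, 453, 289, 123),
  OT_entry   = c("01/01/2012 09:30:00", "20/01/2012 08:20:00",
                 "02/01/2012 09:40:00", "10/01/2012 11:00:00",
                 "10/02/2012 09:40:00"),
  Other_1    = c("xy", "xy", "xy", "zh", "xy"),
  Other_2    = c(22.3, 33.1, 22.1, 33.5, 44.2),
  Other_3    = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Count rows per patient; mutate() needs `name = value`, and n() gives the group size.
result <- reproeg %>%
  group_by(patient_id) %>%
  mutate(NumAdms = n()) %>%
  ungroup()

print(result)
```

Here `mutate()` keeps all five rows and repeats the count on each row of a patient, whereas `summarise(NumAdms = n())` would instead collapse the data to one row per "patient_id".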