I am trying to write a function that returns the x largest MOLECULES, ranked in descending order by how many unique PATIENT_ID values each one has, counting only records from a given date onward.
library(dplyr)
library(lubridate)
data <- data.frame(
  PATIENT_ID = c(1, 1, 2, 2),
  dateM = ymd("2020-01-05", "2020-01-06", "2020-05-06", "2019-12-15"),
  MOLECULES = c("mol1", "mol1", "mol1", "mol2")
)
topx <- function(data, datefrom, var, x = 5) {
  data %>%
    subset(dateM >= datefrom) %>%
    group_by(var) %>%
    summarize(pat = length(unique(PATIENT_ID))) %>%
    arrange(-pat) %>%
    head(x) %>%
    select(1)
}
topx(data = data, datefrom = "2016-04-01", var = MOLECULES, x = 2)
The wanted result in this case would be:
c("mol1","mol2")
However, group_by() looks for a column literally named var instead of using the MOLECULES column I passed in, and it fails with:
Error: Must group by variables found in `.data`.
* Column `var` is not found.
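One direction that may work here, assuming dplyr 1.0 or later (with rlang 0.4 or later), is the embrace operator `{{ }}`, which forwards an unquoted column name passed as a function argument into dplyr verbs. The sketch below is not a confirmed fix, only an illustration under that assumption. It also swaps in a few things that were not in the original function: `filter()` with an explicit `as.Date()` instead of `subset()`, `n_distinct()` instead of `length(unique())`, and `pull()` instead of `select(1)` so that a plain vector comes back rather than a one-column tibble.

```r
library(dplyr)
library(lubridate)

data <- data.frame(
  PATIENT_ID = c(1, 1, 2, 2),
  dateM = ymd(c("2020-01-05", "2020-01-06", "2020-05-06", "2019-12-15")),
  MOLECULES = c("mol1", "mol1", "mol1", "mol2")
)

topx <- function(data, datefrom, var, x = 5) {
  data %>%
    # keep rows on or after the cutoff; datefrom is assumed to be a full "YYYY-MM-DD" date
    filter(dateM >= as.Date(datefrom)) %>%
    # {{ var }} injects the column the caller passed (e.g. MOLECULES)
    group_by({{ var }}) %>%
    # count distinct patients per group
    summarize(pat = n_distinct(PATIENT_ID)) %>%
    arrange(desc(pat)) %>%
    head(x) %>%
    # return the grouping column as a vector
    pull({{ var }})
}

result <- topx(data = data, datefrom = "2016-04-01", var = MOLECULES, x = 2)
print(result)
```

With the sample data, mol1 has two distinct patients and mol2 has one, so mol1 should come first. On R versions before 4.0, `data.frame()` turns strings into factors by default, so the returned vector may be a factor unless it is wrapped in `as.character()`.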