In a recent question I tried to give an answer using dplyr::coalesce to replace NA with a grouped median, but I got this error:
Error: Argument 2 must be an integer vector, not a double vector
While trying to figure out the cause, I got to the point where it looks like the error appears only if nrow(df) is an odd number. I am somewhat doubtful that this is really the explanation, so I decided to ask here: what is the reason for this? The only related issue I found was here, but I'm not sure whether it is the same problem.
Edit:
The error is not raised if I replace median with min or max!
MRE:
library(dplyr)
df <- data.frame(ID = 1:7,
                 Group = c(1, 1, 1, 2, 2, 2, 1),
                 val1 = c(1, NA, 3, 2, 2, 3, 2),
                 val2 = c(2, 2, 2, NA, 1, 3, 2))
df %>%
  group_by(Group) %>%
  mutate_at(vars(-group_cols()), ~coalesce(., median(., na.rm = TRUE))) %>%
  ungroup()
Raises:
Error: Argument 2 must be an integer vector, not a double vector
But if I remove the last row (or the three last rows):
df[1:6, ] %>%
  group_by(Group) %>%
  mutate_at(vars(-group_cols()), ~coalesce(., median(., na.rm = TRUE))) %>%
  ungroup()
It works!?
P.S. Using ifelse(is.na(.), ...) instead of coalesce also works, regardless of the number of rows:
df %>%
  group_by(Group) %>%
  mutate_at(vars(-group_cols()), ~ifelse(is.na(.), median(., na.rm = TRUE), .)) %>%
  ungroup()
P.P.S. The error is also raised when using mean instead of median.
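A diagnostic that might help narrow this down: ID is built with 1:7, so it is an integer column, and it is included by vars(-group_cols()). The sketch below only checks which type each summary function returns for integer input. It is not a confirmed explanation of the coalesce error. The vectors ids_odd and ids_even are hypothetical stand-ins for the Group 1 IDs in df[1:6, ] and in the full df.

```r
# Stand-ins for the integer ID values that fall into Group 1
ids_odd  <- c(1L, 2L, 3L)      # Group 1 IDs in df[1:6, ] (odd count)
ids_even <- c(1L, 2L, 3L, 7L)  # Group 1 IDs in the full df (even count)

# median() of an odd-length integer vector picks the middle element
typeof(median(ids_odd))   # "integer"

# median() of an even-length vector averages the two middle elements
typeof(median(ids_even))  # "double"

# mean() returns a double even for integer input
typeof(mean(ids_odd))     # "double"

# min() and max() keep the integer type
typeof(min(ids_even))     # "integer"
typeof(max(ids_even))     # "integer"
```

If this is the relevant difference, the pattern would match what I see: median and mean can hand coalesce a double for an integer column, while min and max cannot, and whether median does so depends on the group size rather than on nrow(df) itself.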