I have the following data frame (both columns are factors):
     A   B
1 pine  NA
2  fig 234
3  fig 234
4  fig 145
5 pine 123
6  fig  NA
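In case it helps to reproduce this, here is a minimal sketch that builds the data. It assumes the data frame is called df1 (the name used in my code further down) and that both columns are stored as factors, as described above.

```r
# Minimal reproducible version of the data shown above.
# Assumption: both columns are factors, and the object is named df1.
df1 <- data.frame(
  A = factor(c("pine", "fig", "fig", "fig", "pine", "fig")),
  B = factor(c(NA, "234", "234", "145", "123", NA))
)
```

Calling str(df1) should confirm that A and B are both factors and that B contains two missing values.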
I want to replace the missing values in B with the most frequently occurring value of B within each group of A. The above would become:
     A   B
1 pine 123
2  fig 234
3  fig 234
4  fig 145
5 pine 123
6  fig 234
I found the code below in an answer to a similar question. It uses the which.max() function, but I can't get it to work group-wise: it returns the overall max of column B for every NA value, instead of the max within each group of A.
library(dplyr)

df2 <- df1 %>%
  group_by(A) %>%
  add_count(B) %>%  # n = number of rows per (A, B) combination
  mutate(B = if_else(is.na(B), B[which.max(n)], B)) %>%
  select(-n) %>%
  ungroup()
Do I need an extra group_by() somewhere?