I am trying to create a new variable based on the mode (the most frequently observed value) of a separate variable.
Using this df:
help <- data.frame(
  id = c(rep(5, times = 8), rep(10, times = 8), rep(12, times = 8)),
  episode = c(rep(1, times = 4), rep(2, times = 4), rep(3, times = 8), rep(1, times = 4), rep(2, times = 4)),
  provider = c(rep(70, times = 2), rep(80, times = 2), rep(70, times = 4), rep(30, times = 6), rep(40, times = 2), rep(70, times = 4), rep(10, times = 4))
)
I am hoping to create a new variable, provider_mode, that holds the modal provider (the provider with the most observations) for each episode within each id.
The end df would look like this:
id episode provider provider_mode
5 1 70 70
5 1 70 70
5 1 80 70
5 1 80 70
5 2 70 70
5 2 70 70
5 2 70 70
5 2 70 70
10 3 30 30
10 3 30 30
10 3 30 30
10 3 30 30
10 3 30 30
10 3 30 30
10 3 40 30
10 3 40 30
12 1 70 70
12 1 70 70
12 1 70 70
12 1 70 70
12 2 10 10
12 2 10 10
12 2 10 10
12 2 10 10
Here is the code I have come up with so far, but it only gives me the count for each provider within each episode. I need a mutate call that fills provider_mode with the provider that has the most observations in the episode and, if there is a tie, picks the provider that appears first (e.g., provider 70 in episode 1 of id 5).
library(dplyr)
help %>% group_by(id, episode, provider) %>% mutate(provider_count = n())
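For reference, here is a minimal sketch of one way this could be approached, assuming dplyr is available and that "first" in a tie means the provider that appears earliest within the group. The helper `first_mode` is a hypothetical name introduced only for illustration; it is not part of dplyr or base R. The idea is to group by id and episode only (not provider), then compute the mode of provider inside mutate.

```r
library(dplyr)

# Same example data as in the question
help <- data.frame(
  id = c(rep(5, times = 8), rep(10, times = 8), rep(12, times = 8)),
  episode = c(rep(1, times = 4), rep(2, times = 4), rep(3, times = 8),
              rep(1, times = 4), rep(2, times = 4)),
  provider = c(rep(70, times = 2), rep(80, times = 2), rep(70, times = 4),
               rep(30, times = 6), rep(40, times = 2), rep(70, times = 4),
               rep(10, times = 4))
)

# Hypothetical helper (not from any package): returns the most frequent
# value of x. unique() keeps values in order of first appearance and
# which.max() returns the first maximum, so ties go to the value seen first.
first_mode <- function(x) {
  ux <- unique(x)
  counts <- tabulate(match(x, ux))
  ux[which.max(counts)]
}

result <- help %>%
  group_by(id, episode) %>%                      # one group per id/episode pair
  mutate(provider_mode = first_mode(provider)) %>%
  ungroup()
```

Grouping by provider as well, as in the attempt above, makes every group contain a single provider, which is why only per-provider counts come out. If ties should be broken differently (for example, by the smallest provider code), the helper would need to sort `ux` first.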