I have a data frame, df, with the columns date_entered and person_id. I first extracted the month from date_entered using
df$month <- as.Date(cut(df$date_entered, breaks = "month"))
and then created a data frame of frequencies by person_id using
library(dplyr)
occurences <- df %>%
  count(month, person_id)
where month is the first day of the month, person_id identifies the person, and n is the count per month for that person_id:
| month | person_id | n |
| ---------- | ----------|----|
| 2021-01-01 | 12345652 | 2 |
| 2021-01-01 | 56412342 | 6 |
| 2021-01-01 | 45621311 | 11 |
| 2021-01-01 | 45213652 | 8 |
| 2021-01-01 | 69534000 | 1 |
| 2021-01-01 | 60221351 | 4 |
| 2021-02-01 | 12345652 | 8 |
| 2021-02-01 | 12342546 | 6 |
| 2021-02-01 | 52013000 | 3 |
| 2021-02-01 | 33251000 | 1 |
| 2021-02-01 | 55210000 | 6 |
| 2021-02-01 | 10012310 | 4 |
| 2021-03-01 | 00012342 | 2 |
I tried various approaches, including:
library(ggplot2)
count_n <- occurences$n
a_number <- occurences$person_id
occurences_df <- data.frame(month = occurences$month, person_id = occurences$person_id, count_n = count_n)
# keep the 20 rows with the highest counts
ggplot(occurences_df[tail(order(occurences_df$count_n), 20), ]) +
  aes(x = reorder(person_id, -count_n), y = count_n) +
  geom_bar(stat = "identity") +
  labs(x = "top 20", y = "number of days in QA") +
  theme(axis.text.x = element_blank())
So far, with the ggplot code above (using my original dataset), I am able to create the plot below, but without the grouping by month:
Each bar above refers to a unique person_id, and its height is the number of times that person_id occurred. However, I would like to show the top 5 per month, based on either the date_entered variable or the month variable in the occurences table.
I would like to see something like this:
Instead of the week number, the x-axis would show the top 5 person_id values for each month.
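To make the goal concrete, here is a minimal sketch of the kind of result I am after. It assumes dplyr 1.0.0 or later for slice_max(); the name top5, the dodged-bar layout, and storing person_id as character (so the leading zeros survive) are illustrative choices of mine, and the data are just the rows from the table above.

```r
library(dplyr)
library(ggplot2)

# Counts copied from the table above; person_id is kept as character
# so that leading zeros (e.g. "00012342") are preserved.
occurences <- data.frame(
  month = as.Date(c(rep("2021-01-01", 6), rep("2021-02-01", 6), "2021-03-01")),
  person_id = c("12345652", "56412342", "45621311", "45213652", "69534000",
                "60221351", "12345652", "12342546", "52013000", "33251000",
                "55210000", "10012310", "00012342"),
  n = c(2, 6, 11, 8, 1, 4, 8, 6, 3, 1, 6, 4, 2)
)

# Keep at most 5 rows per month, those with the largest n.
# with_ties = FALSE caps each month at exactly 5 rows even when counts tie.
top5 <- occurences %>%
  group_by(month) %>%
  slice_max(n, n = 5, with_ties = FALSE) %>%
  ungroup()

# One bar per person_id, placed side by side within each month.
# preserve = "single" keeps bar widths equal in months with fewer than 5 rows.
p <- ggplot(top5, aes(x = format(month, "%Y-%m"), y = n, group = person_id)) +
  geom_col(position = position_dodge2(preserve = "single")) +
  labs(x = "month", y = "number of days in QA")
print(p)
```

With the real data, occurences would come from the count() call earlier rather than being typed in by hand. Faceting by month with facet_wrap() might be an alternative to dodging if the person_id labels need to be visible on the axis.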