I'm summarising some data and get different median values from a table created with the dplyr package and from a boxplot (ggplot2). The sample data can be found here:
dplyr table
library(dplyr)
library(ggplot2)
sample2 = read.csv("sample2.csv")
sample2 %>%
  group_by(category) %>%
  summarise(median_avg = median(avg_value), median_total = median(total_value))
The resulting median of total_value for the 3+ category is about 307:
# A tibble: 3 × 3
  category median_avg median_total
  <chr>         <dbl>        <dbl>
1 1            17.500        37.07
2 2            16.830       117.48
3 3+           17.375       306.95
However, when I visualise it as a boxplot, I get a different median for the 3+ category, below 200:
boxplot
sample2 %>%
  ggplot(aes(category, total_value)) +
  geom_boxplot() +
  scale_y_continuous(limits = c(0, 500))
I tried this with dummy data and there was no discrepancy between the table and the boxplot. Any ideas what causes the problem in this particular dataset? Thanks for your help!
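One thing that may be worth checking (an assumption about the cause, not something confirmed for sample2.csv): scale_y_continuous(limits = ...) discards observations outside the limits before geom_boxplot() computes its statistics, whereas coord_cartesian(ylim = ...) only zooms the view. The sketch below uses a small made-up data frame (toy, not the real data) to show how the two can give different medians. layer_data() reads back the median that ggplot2 actually draws.

```r
library(ggplot2)

# Hypothetical toy data (not sample2.csv): one group with values above 500
toy <- data.frame(
  category = "3+",
  total_value = c(50, 100, 150, 200, 600, 700, 800)
)

# Median of all values, as summarise() would report it
median_all <- median(toy$total_value)  # 200

# Median of only the values that fall inside the axis limits 0..500
in_limits <- toy$total_value >= 0 & toy$total_value <= 500
median_in_limits <- median(toy$total_value[in_limits])  # 125

base_plot <- ggplot(toy, aes(category, total_value)) + geom_boxplot()

# Limits set on the scale: out-of-range rows are dropped (with a warning)
p_scale <- base_plot + scale_y_continuous(limits = c(0, 500))

# Limits set on the coordinate system: all rows are kept, the view is zoomed
p_coord <- base_plot + coord_cartesian(ylim = c(0, 500))

# Medians as drawn by each plot
layer_data(p_scale)$middle
layer_data(p_coord)$middle
```

If the real data has many total_value observations above 500 in the 3+ category, the same effect could explain the gap. A warning about removed rows when the plot is printed would be a hint in that direction.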