I'm trying to reorder my boxplots because ggplot is plotting them in alphabetical order. I'd like to specify the ordering myself.
I have a dataframe called stride with 10 columns. These include the subject_id, the age_group (young, middle or old) and stride_int.
I used the following code to create the boxplot:
stride %>%
  ggplot(aes(x = age_group, y = stride_int)) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 21, outlier.fill = "red", outlier.size = 2) +
  theme_light() +
  labs(title = "Stride interval for different age groups",
       y = "stride interval",
       x = "age group")
This plots the boxplots in alphabetical order of age_group, i.e. 'middle', 'old' and 'young'.
I would like to order them as 'young', 'middle' and 'old'.
I tried the following:
stride %>%
  arrange(age_group) %>%
  mutate(age_group = factor(age_group, levels = c("young", "middle", "old"))) %>%
  ggplot(aes(x = age_group, y = stride_int)) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 21, outlier.fill = "red", outlier.size = 2) +
  theme_light() +
  labs(title = "Stride interval for different age groups",
       y = "stride interval",
       x = "age group")
but it plots just one boxplot. There are no NAs in my dataframe, so I'm not sure what's going on.
I've pasted the output of dput(head(stride)) below. The age_group column is already of type character. I'm not sure what row.names is.
structure(list(time = c(4.0433, 5.1533, 6.1, 9.9633, 11.06, 12.04
), stride_int = c(0.85, 1.11, 0.9467, 1.11, 1.0967, 0.98), subject_id
= c(1, 1, 1, 1, 1, 1), age_months = c(40, 40, 40, 40, 40, 40), gender
= c("M", "M", "M", "M", "M", "M"), height_cm = c(102.87, 102.87,
102.87, 102.87, 102.87, 102.87), weight_kg = c(19.5046720493514,
19.5046720493514, 19.5046720493514, 19.5046720493514,
19.5046720493514, 19.5046720493514), leg_length_cm = c(58.42, 58.42,
58.42, 58.42, 58.42, 58.42), speed_ms = c(1.04289, 1.04289, 1.04289,
1.04289, 1.04289, 1.04289), age_group = c("Young", "Young", "Young",
"Young", "Young", "Young")), row.names = c(NA, -6L), class =
c("tbl_df", "tbl", "data.frame"))
I've also replicated a minimal version of my dataframe below:
time      stride_int  subject_id  gender  leg_length_cm  speed_ms  age_group
<dbl>     <dbl>       <dbl>       <chr>   <dbl>          <dbl>     <chr>
4.04      0.85        1           M       58.4           1.04      Young
5.15      1.11        1           M       58.4           1.04      Young
184.60    0.9533      33          F       68.58          1.492     Middle
185.59    0.9900      33          F       68.58          1.492     Middle
186.56    0.970       33          F       68.58          1.492     Middle
64.3600   1.0400      39          F       83.82          1.079     Old
65.3933   1.0333      39          F       83.82          1.079     Old
66.4433   1.0500      39          F       83.82          1.079     Old
477.8933  0.9167      9           F       50.8           1.1377    Young
479.0200  1.1267      9           F       50.8           1.1377    Young
480.3135  1.0883      9           F       50.8           1.1377    Young
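
One thing that may be worth checking, judging only from the data shown above: the age_group values are capitalised ("Young", "Middle", "Old"), whereas the levels passed to factor() in my attempt are lowercase. factor() turns any value that is not listed in levels into NA, which could explain why only a single boxplot appears. Below is a minimal base-R sketch of that behaviour. The vector `age` is a hypothetical stand-in for stride$age_group, not the real data.

```r
# Hypothetical stand-in for stride$age_group (capitalised as in the dput output)
age <- c("Young", "Middle", "Old", "Young")

# Lowercase levels match none of the values, so every element becomes NA
mismatched <- factor(age, levels = c("young", "middle", "old"))
all(is.na(mismatched))

# Levels spelled exactly like the data keep the values and fix the order
matched <- factor(age, levels = c("Young", "Middle", "Old"))
levels(matched)
```

If this is the cause, using the capitalised spellings as levels in the mutate() call should give the young, middle, old ordering, assuming the full dataframe contains only these three spellings.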