
My dataset (called data_set_C) looks like this:

treatment  value
1 before 554.481
2 before 965.207
3 before 759.844
4 before 1252.716
5 during 1161.710
6 during 1252.716
7 during 1026.816
8 during 1031.018
9 during 972.932
10 during 914.847
11 during 1004.413
12 during 1074.582
13 during 972.932
14 during 975.475
15 during 1466.659
16 during 1550.493
17 during 1314.325
18 during 1408.573
19 during 1263.360
20 during 1248.838
21 during 1322.136
22 during 1306.924
23 during 1248.838
24 during 1263.360
25 after 944.671
26 after 929.368
27 after 1001.975
28 after 975.475
29 after 954.939
30 after 985.744

But when I plot it with geom_boxplot(), the boxes are not in the order before, during, after that I want. They appear as after, before, during. I have tried reordering with levels() and reorder(), but I can't quite get it to work. I simply want the order before, during, after.

ggplot(data_set_C, aes(x=as.factor(treatment),  y = value, fill = treatment)) +
  geom_boxplot()

1 Answer


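By default, a character column is converted to a factor whose levels are sorted alphabetically, which is why you get after, before, during. You could achieve your desired result by converting treatment to a factor with the levels set in your desired order: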

library(ggplot2)

data_set_C$treatment <- factor(data_set_C$treatment, levels = c("before", "during", "after"))

ggplot(data_set_C, aes(x = treatment, y = value, fill = treatment)) +
  geom_boxplot()
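
To check the level order without plotting, here is a minimal sketch in base R. It uses a short made-up vector `treatment_chr` rather than the full data, and the capitalised labels in the last step only illustrate a common pitfall; they are not a diagnosis of what went wrong in the question.

```r
# Hypothetical character vector standing in for data_set_C$treatment
treatment_chr <- c("before", "during", "after")

# as.factor() on a character vector sorts the levels alphabetically:
# after, before, during
alphabetical <- levels(as.factor(treatment_chr))
print(alphabetical)

# factor() with explicit levels keeps the order you give it:
# before, during, after
ordered_levels <- levels(
  factor(treatment_chr, levels = c("before", "during", "after"))
)
print(ordered_levels)

# Pitfall: level names are case-sensitive and must match the values exactly.
# Values that match none of the levels silently become NA.
mismatched <- factor(treatment_chr, levels = c("Before", "During", "After"))
print(mismatched)
```

If the last call returns only NA values, the labels passed to `levels` do not match the spelling used in the column.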

DATA

data_set_C <- structure(list(treatment = structure(c(
  1L, 1L, 1L, 1L, 2L, 2L,
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
  2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L
), levels = c(
  "before", "during",
  "after"
), class = "factor"), value = c(
  554.481, 965.207, 759.844,
  1252.716, 1161.71, 1252.716, 1026.816, 1031.018, 972.932, 914.847,
  1004.413, 1074.582, 972.932, 975.475, 1466.659, 1550.493, 1314.325,
  1408.573, 1263.36, 1248.838, 1322.136, 1306.924, 1248.838, 1263.36,
  944.671, 929.368, 1001.975, 975.475, 954.939, 985.744
)), row.names = c(
  "1",
  "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
  "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
  "25", "26", "27", "28", "29", "30"
), class = "data.frame")
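
The DATA block above is the kind of output `dput()` produces. As a rough sketch of how that round trip works, assuming a small made-up data frame `df` (three rows only, not the full data set):

```r
# Hypothetical three-row data frame standing in for data_set_C
df <- data.frame(
  treatment = factor(
    c("before", "during", "after"),
    levels = c("before", "during", "after")
  ),
  value = c(554.481, 1161.71, 944.671)
)

# dput() prints R code that recreates the object, including the factor levels
dput(df)

# Round trip: capture that code as text, then evaluate it to rebuild the object
code <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = code))

# The rebuilt data frame keeps the custom level order
print(levels(df_copy$treatment))
```

Pasting the printed `structure(...)` call into a question lets others recreate the data frame with the same column types and level order.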

stefan
  • Thanks for your response! I am super new to coding and R, so I don't entirely understand the structure of your data. My data.frame was created using rbind like this: – Astarte Aug 12 '22 at 21:07
  • `Before <- data.frame(treatment = rep("before", 4), value = c(554.481, 965.207, 759.844, 1252.716)); During <- data.frame(treatment = rep("during", 20), value = c(1161.71, 1252.716, 1026.816, 1031.018, 972.932, 914.847, 1004.413, 1074.582, 972.932, 975.475, 1466.659, 1550.493, 1314.325, 1408.573, 1263.36, 1248.838, 1322.136, 1306.924, 1248.838, 1263.36)); After <- data.frame(treatment = rep("after", 20), value = c(944.671, 929.368, 1001.975, 975.475, 954.939, 985.744)); data_set_C = rbind(Before, During, After)` – Astarte Aug 12 '22 at 21:08
  • I already tried using the levels function exactly as you did, but it did not work. My guess is that it's related to how my data.frame is set up. – Astarte Aug 12 '22 at 21:09
  • Hm. First, about the way I provided the data: that's the output of calling `dput()` on the example data you provided. `dput()` is a way to share data in a reproducible fashion. For more on this, see [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). You could also try it yourself with `dput(data_set_C)`. But using `data.frame` as you did in your second comment is generally fine too. – stefan Aug 12 '22 at 21:16
  • As for why the code did not work, I can only guess what went wrong. I would recommend having a look (sooner or later) at the `reprex` package, which lets you share code together with its results, including errors, in a reproducible fashion, so that others can reproduce your issue. – stefan Aug 12 '22 at 21:18
  • I figured out why it wasn't working! Thanks for your help (: – Astarte Aug 12 '22 at 22:56