I have been using ggplot2 with geom_boxplot to plot multiple boxplots in one graph. The data looks something like this:

Month   Rainfall

1         45
1         12
1         14
2         65
2         45
2         78
3         10
3         35
3         92
.         .
.         .
.         .

Using geom_boxplot, I want one boxplot of the Rainfall values for each Month group (1, 2, 3, ...). The result I am getting looks wrong, and the order of the groups seems messed up. Any help?

ggplot(data=edit3)+geom_boxplot(aes(x=Month, y=Rainfall))

Note: edit3 is the dataframe with the values of Rainfall and Month.

dput(head(edit3[,c("Month","Rainfall")],9))
structure(list(Month = c("1", "1", "1", "1", "1", "1", "1", "1", 
"2"), Rainfall = c(NA, 135.6, 34.2, 39.4, 134.6, 234.6, 69.6, 
92.8, NA)), row.names = c(NA, -9L), class = c("tbl_df", "tbl", 
"data.frame"))
bbiasi
akis
  • 4
    The Months are ordered lexicographically, most likely because that column is a character or a factor. Convert it to a numeric column. – joran May 30 '19 at 20:32
  • 1
    Not sure what the best duplicate would be, maybe [this](https://stackoverflow.com/q/38413264/324364) or, possibly [this](https://stackoverflow.com/q/3418128/324364) – joran May 30 '19 at 20:39
  • I just tried this using `edit3$Month <- as.integer(edit3$Month)` and the result is only one boxplot of the total values of Rainfall. Any suggestion? – akis May 30 '19 at 20:39
  • Thanks, I will check both. Could not find a duplicate. – akis May 30 '19 at 20:40
  • 2
    Not without an actual reproducible example to work with. But I can tell you with 100% certainty that the ordering you're seeing is because that column was originally either a character or factor. If it was a factor, then `as.integer` won't necessarily convert it the way you want. See my second link. – joran May 30 '19 at 20:40
  • Following up on joran's last comment, if your variable is a factor you need to convert it to text before converting to numeric. Try `edit3$Month = as.integer(as.character(edit3$Month))`. – user2363777 May 30 '19 at 20:42
  • I see that `typeof(edit3)` is list. I have been working the data with dplyr and tidyr, not sure why list. But `typeof(edit3$Month)` is integer. – akis May 30 '19 at 20:48
  • Can you edit your question and add the output of ```dput(head(edit3[,c("Month","Rainfall")],9))```? – M-- May 30 '19 at 20:50
  • Huh, I thought for some reason ggplot would do multiple boxplots for an integer x axis, but clearly it does not. You have to make it a factor, but specify the order of the levels. – joran May 30 '19 at 20:54
  • 3
    Ah, there you go, you can do with integer x axis, you just also have to specify `group = Month`. – joran May 30 '19 at 20:56
  • `edit3`'s type is a list because data frames are lists—they're lists of columns – camille May 31 '19 at 00:21
  • Possible duplicate of [Numeric axis labels in incorrect order](https://stackoverflow.com/questions/38413264/numeric-axis-labels-in-incorrect-order) – camille May 31 '19 at 00:23
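
The suggestion from the comments (convert Month to an integer, then add `group = Month`) can be sketched as follows. This is only an illustration: `edit3_demo` is a hypothetical stand-in for the asker's `edit3`, built from random values, and the `scale_x_continuous()` call is an optional addition that was not mentioned in the thread.

```r
library(ggplot2)

# Hypothetical stand-in for edit3: Month stored as character, as in the dput() output
set.seed(1)
edit3_demo <- data.frame(
  Month = rep(as.character(1:12), each = 3),
  Rainfall = rnorm(36, mean = 60, sd = 30),
  stringsAsFactors = FALSE
)

# Convert to integer so the x axis is ordered numerically
edit3_demo$Month <- as.integer(edit3_demo$Month)

# With a continuous x, group = Month is what splits the data into one box per month
p <- ggplot(edit3_demo) +
  geom_boxplot(aes(x = Month, y = Rainfall, group = Month)) +
  scale_x_continuous(breaks = 1:12)  # optional: label every month on the axis

p
```

Without `group = Month`, all rows fall into a single group, which matches the "only one boxplot" result reported above.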

1 Answer

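Because your months are stored as text (or as a factor), they are sorted alphabetically rather than numerically, so you just need to reorder the factor levels. Here I used fct_inorder() from the forcats package, which orders the levels by their first appearance in the data.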

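library(dplyr)
library(forcats)
library(ggplot2)

# Order the Month levels by first appearance (1, 2, 3, ...) instead of alphabetically
edit31_1 <- edit3 %>% 
  dplyr::mutate(Month = forcats::fct_inorder(Month))

ggplot2::ggplot(edit31_1) +
  geom_boxplot(aes(x = Month, y = Rainfall))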
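
Note that fct_inorder() gives the order 1, 2, 3, ... only when the rows are already sorted by month. If they might not be, a variant (not part of the original answer) is to set the levels explicitly in numeric order. A minimal base R sketch, using a hypothetical shuffled vector `month_chr`:

```r
# Hypothetical character column in shuffled row order
month_chr <- c("10", "2", "1", "12", "3", "11")

# Sort the distinct values numerically and use them as the level order
lvls <- as.character(sort(unique(as.integer(month_chr))))
month_fct <- factor(month_chr, levels = lvls)

levels(month_fct)
```

The same `levels = lvls` idea could be applied to `edit3$Month` before plotting, assuming every value in that column parses as a whole number.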

  • Fictitious data used:
library(ggplot2)

set.seed(1)
edit3 <- data.frame(Month = as.factor(rep(paste(seq(1, 12, 1)), 3)),
                    Rainfall = rnorm(n = 36, mean = 60, sd = 30))

ggplot2::ggplot(edit3) +
  geom_boxplot(aes(x = Month, y = Rainfall))

bbiasi