
I am plotting things using facet_wrap and facet_grid in ggplot, like:

ggplot(iris) + geom_histogram(aes(Petal.Width)) + facet_grid(Species ~ .)

Is it possible to control the order in which the Species panels appear in the plot? Can this be done without changing the iris data frame or making a new one? The default order is setosa, versicolor, virginica, but I'd like a different one. Thanks.

  • maybe `facet_grid(factor(Species,levels=c("virginica","setosa","versicolor")) ~ .)` ? [oops, doesn't work] – Ben Bolker Feb 27 '13 at 15:46
  • As Ben notes, the way to control the ordering of basically everything in ggplot (bars in bar plots, facets, etc.) is to use a factor and adjust the order of the levels. – joran Feb 27 '13 at 15:47
  • ...or this: http://stackoverflow.com/q/3311901/324364 – joran Feb 27 '13 at 15:49
  • Jumping to duplicates is not helpful - Ben Bolker's answer is much simpler than the one given in the first post you link to. This is clearly a topic of interest and several approaches are useful. –  Feb 27 '13 at 17:07

1 Answer


I don't think I can really satisfy your "without making a new data frame" requirement, but you can create the new data frame on the fly:

ggplot(transform(iris,
      Species = factor(Species, levels = c("virginica", "setosa", "versicolor")))) +
    geom_histogram(aes(Petal.Width)) + facet_grid(Species ~ .)

or, in tidyverse idiom:

library(dplyr)
library(ggplot2)
iris %>%
   mutate(Species = factor(Species, levels = c("virginica", "setosa", "versicolor"))) %>%
ggplot() +
   geom_histogram(aes(Petal.Width)) +
   facet_grid(Species ~ .)

I agree it would be nice if there were another way to control this, but ggplot is already a pretty powerful (and complicated) engine ...
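For what it's worth, more recent versions of ggplot2 appear to accept an expression directly in the facet specification, which avoids building a modified data frame at all. A minimal sketch, assuming ggplot2 3.0.0 or later (this is an assumption about the installed version; the same idea did not work when the question was asked). The `bins = 30` argument is only there to silence the default-binwidth message:

```r
library(ggplot2)

# The factor is rebuilt inside the facet formula, so iris itself is untouched.
# Assumes a ggplot2 version that evaluates expressions in facet specifications.
p <- ggplot(iris) +
  geom_histogram(aes(Petal.Width), bins = 30) +
  facet_grid(factor(Species, levels = c("virginica", "setosa", "versicolor")) ~ .)

print(p)
```

If the installed version rejects the expression, the `transform()` or `mutate()` approaches above remain the safe fallback.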

Note that the order of (1) the rows in the data set is independent of the order of (2) the levels of the factor. #2 is what factor(...,levels=...) changes, and what ggplot looks at to determine the order of the facets. Doing #1 (sorting the rows of the data frame in a specified order) is an interesting challenge. I think I would actually achieve this by doing #2 first, and then using order() or arrange() to sort according to the numeric values of the factor:

neworder <- c("virginica","setosa","versicolor")
library(plyr)  ## or dplyr (transform -> mutate)
iris2 <- arrange(transform(iris,
             Species=factor(Species,levels=neworder)),Species)

I can't immediately see a quick way to do this without changing the order of the factor levels (you could do it and then reset the order of the factor levels accordingly).
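One possible way to sort the rows in a custom order while leaving the factor levels alone is to sort on each row's position in the desired order. This is only a base R sketch using `match()` and `order()`, not something from the original answer; `iris_sorted` is an illustrative name:

```r
neworder <- c("virginica", "setosa", "versicolor")

# match() gives each row's position in neworder; order() sorts rows by that position
iris_sorted <- iris[order(match(iris$Species, neworder)), ]

levels(iris_sorted$Species)                 # factor levels are unchanged
unique(as.character(iris_sorted$Species))   # row order now follows neworder
```

Because the levels are untouched, ggplot would still draw the facets in the original level order for `iris_sorted`; only the row order differs.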

In general, functions in R that depend on the order of levels of a categorical variable are based on factor level order, not the order of the rows in the dataset: the answer above applies more generally.

Ben Bolker
  • Could you explain what the `transform` call is doing here precisely? If I type `transform(iris, factor(Species,levels=c("virginica","setosa","versicolor")))` into R, it doesn't output a dataframe with `Species` in the order given by `levels` –  Feb 27 '13 at 17:04
  • I think you left out the `Species=` part of the `transform` call ... `transform` *is* generating a new data frame. – Ben Bolker Feb 27 '13 at 17:08
  • I mistyped it, sorry. If I do `transform(iris, Species=factor(Species,levels=c("virginica","setosa","versicolor")))` in R, it outputs a dataframe that has the order: `setosa, versicolor, virginica` so I don't understand how transform works here. Your full call does produce the graph I wanted but I am confused as to why this `transform` call doesn't give the order specified in `levels`. thanks –  Feb 27 '13 at 17:10
  • the order of the **rows in the data set** is independent of the order of the **levels of the factor**. The latter is what `ggplot` pays attention to. – Ben Bolker Feb 27 '13 at 20:37
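The distinction in the last comment can be checked directly. A small sketch in base R, where `reordered` is just an illustrative name: the `transform()` call changes the level order of `Species` but leaves every row where it was, which is why the printed data frame looks unchanged.

```r
reordered <- transform(iris,
    Species = factor(Species, levels = c("virginica", "setosa", "versicolor")))

levels(iris$Species)        # original level order
levels(reordered$Species)   # new level order, the one ggplot uses for facets

# Row order is untouched: the per-row species values compare as identical
identical(as.character(reordered$Species), as.character(iris$Species))
```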