155

I am making a dodged bar chart using ggplot with a discrete x scale. The x-axis is currently arranged in alphabetical order, but I need to reorder it by the y-axis value (i.e., the tallest bar should be positioned on the left).

I tried order and sort, but they only sort the x-axis labels, not the bars that go with them.

What have I done wrong?

camille
lokheart

7 Answers

213

The best way for me was to pass a vector of the categories, in the order I need, as the limits argument to scale_x_discrete. I think it is a simple and straightforward solution.

ggplot(mtcars, aes(factor(cyl))) +
  geom_bar() +
  # limits are given as strings because the levels of factor(cyl) are character
  scale_x_discrete(limits = c("8", "4", "6"))
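
The limits vector does not have to be typed by hand. A minimal sketch, assuming the bars should be ordered by count, tallest first (cyl_counts and cyl_order are illustrative names, not part of the original answer):

```r
library(ggplot2)

# Count rows per cylinder value and sort so the most frequent comes first.
cyl_counts <- sort(table(mtcars$cyl), decreasing = TRUE)

# The names of the sorted table are the level labels, as a character vector.
cyl_order <- names(cyl_counts)

p <- ggplot(mtcars, aes(factor(cyl))) +
  geom_bar() +
  scale_x_discrete(limits = cyl_order)
```

Printing p draws the chart; the data frame itself is left unchanged.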

Axeman
Yuriy Petrovskiy
  • 1
    @HendyIrawan there is no legend unless you've got other dimensions (color, fill) also mapped to the same variable. – Gregor Thomas Oct 08 '15 at 23:10
  • 6
    I think this is the best answer. It controls the order of the x-axis values and doesn't transform or affect the data-frame. The use of `factor` and `reorder` changes characteristics of the data, albeit within the `ggplot()` call, and so does more than it need do for the problem at hand. – mjandrews Nov 22 '18 at 17:38
  • 3
    This should be the accepted answer !! Why complicate things by writing 2 to 3 lines of code for something you can do in a single elegant (predefined) line of code? – SilSur Mar 26 '19 at 12:47
  • 1
    This also worked for me to order x by value of y: `scale_x_discrete(limits = DT$x[order(-DT$y)])+` – armipunk Aug 01 '19 at 19:26
  • If you want to specify the order for all possible levels but don't want unused levels to end up in your plot, check out [my answer below](https://stackoverflow.com/a/66483069/1870254). – jan-glx Mar 04 '21 at 21:16
111

Try manually setting the levels of the factor on the x-axis. For example:

library(ggplot2)
# Automatic levels
ggplot(mtcars, aes(factor(cyl))) + geom_bar()    

ggplot of the cars dataset with factor levels automatically determined

# Manual levels
cyl_table <- table(mtcars$cyl)
cyl_levels <- names(cyl_table)[order(cyl_table)]
mtcars$cyl2 <- factor(mtcars$cyl, levels = cyl_levels)
# Just to be clear, the above line is no different than:
# mtcars$cyl2 <- factor(mtcars$cyl, levels = c("6","4","8"))
# You can manually set the levels in whatever order you please. 
ggplot(mtcars, aes(cyl2)) + geom_bar()

ggplot of the cars dataset with factor levels reordered manually

As James pointed out in his answer, reorder is the idiomatic way of reordering factor levels.

mtcars$cyl3 <- with(mtcars, reorder(cyl, cyl, function(x) -length(x)))
ggplot(mtcars, aes(cyl3)) + geom_bar()

ggplot of the cars dataset with factor levels reordered using the reorder function
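
When the bar heights come from a column rather than from row counts, the same reorder() call can take that column as its second argument. A minimal sketch, assuming a small made-up data frame (sales, region, total and region_ordered are illustrative names, not from the question):

```r
library(ggplot2)

# Hypothetical pre-summarised data: one row per bar.
sales <- data.frame(
  region = c("east", "north", "south", "west"),
  total  = c(12, 30, 7, 21)
)

# reorder() sorts levels by ascending score, so negating the value
# puts the largest total first and the tallest bar on the left.
sales$region_ordered <- reorder(sales$region, -sales$total)

p <- ggplot(sales, aes(x = region_ordered, y = total)) +
  geom_col()
```

For a dodged chart with several rows per category, passing FUN = sum to reorder() should order the categories by their totals instead of the default mean.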

Richie Cotton
44

You can use reorder:

qplot(reorder(factor(cyl),factor(cyl),length),data=mtcars,geom="bar")

Edit:

To have the tallest bar at the left, you have to use a bit of a kludge:

qplot(reorder(factor(cyl),factor(cyl),function(x) length(x)*-1),
   data=mtcars,geom="bar")

I would expect this to also have negative heights, but it doesn't, so it works!
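
The heights stay positive because reorder() only uses the function's result to rank the factor levels; geom_bar() still counts the rows itself. A quick check in base R (cyl_f is an illustrative name):

```r
# Rank the levels by negated group size; the data values are untouched.
cyl_f <- reorder(factor(mtcars$cyl), mtcars$cyl, function(x) -length(x))

levels(cyl_f)          # level order used for the x-axis
attr(cyl_f, "scores")  # the negative scores, used for ordering only
table(cyl_f)           # the counts that geom_bar() would draw
```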

James
  • 8
    I'm shocked this answer doesn't have more upvotes, 90% of the time this is the proper way to do it. – Gregor Thomas Nov 03 '15 at 02:00
  • 1
    I think both the factor calls are superfluous. There is an implicit call to factor for the first argument and the second argument is assumed to be numeric. – IRTFM Sep 16 '17 at 16:38
  • An explanation that helped me figure out what these solutions were doing under the hood: https://rstudio-pubs-static.s3.amazonaws.com/7433_4537ea5073dc4162950abb715f513469.html – keithpjolley Sep 06 '18 at 17:47
  • `desc(reorder(.))` can give you the tallest bar on the left. – Rafs Aug 04 '20 at 12:06
40

Hadley has been developing a package called forcats, which makes this task much easier. Use fct_infreq() when you want to order the x-axis by the frequency of a factor's levels. In the mtcars example in this post, you want to reorder the levels of cyl by how often each level occurs, so that the most frequent level sits on the left. fct_infreq() is all you need.

library(ggplot2)
library(forcats)

ggplot(mtcars, aes(fct_infreq(factor(cyl)))) +
  geom_bar() +
  labs(x = "cyl")

If you want to go the other way around, use fct_rev() together with fct_infreq().

ggplot(mtcars, aes(fct_rev(fct_infreq(factor(cyl))))) +
  geom_bar() +
  labs(x = "cyl")
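
fct_infreq() orders by how often each level occurs. For pre-summarised data, where the bar height is a column of its own, forcats also provides fct_reorder(). A minimal sketch, assuming a made-up data frame (scores, team and points are illustrative names, not from the original post):

```r
library(ggplot2)
library(forcats)

# Hypothetical data: one row per bar, height stored in `points`.
scores <- data.frame(
  team   = c("a", "b", "c"),
  points = c(5, 9, 2)
)

# .desc = TRUE sorts levels from the largest value to the smallest,
# so the tallest bar is drawn on the left.
p <- ggplot(scores, aes(x = fct_reorder(team, points, .desc = TRUE), y = points)) +
  geom_col() +
  labs(x = "team")
```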

jazzurro
2

I realize this is old, but maybe this function I created is useful to someone out there:

order_axis<-function(data, axis, column)
{
  # for interactivity with ggplot2
  arguments <- as.list(match.call())
  col <- eval(arguments$column, data)
  ax <- eval(arguments$axis, data)

  # evaluated factors
  a<-reorder(with(data, ax), 
             with(data, col))

  #new_data
  df<-cbind.data.frame(data)
  # define new var
  within(df, 
         do.call("<-",list(paste0(as.character(arguments$axis),"_o"), a)))
}

Now, with this function you can interactively plot with ggplot2, like this:

ggplot(order_axis(df, AXIS_X, COLUMN_Y), 
       aes(x = AXIS_X_o, y = COLUMN_Y)) +
        geom_bar(stat = "identity")

As can be seen, the order_axis function returns a new data frame with an extra column that has the same name as the axis column plus an _o suffix. This new column has its levels in ascending order, so ggplot2 automatically plots in that order.

This is somewhat limited (only works for character or factor and numeric combinations of columns and in ascending order) but I still find it very useful for plotting on the go.

eflores89
  • I guess I don't see the advantage of this compared to just using `reorder` directly. Doesn't `ggplot(df, aes(x = reorder(AXIS_X, COLUMN_Y), y = COLUMN_Y)) + ...` do the same thing, about as concisely, and without the helper function? – Gregor Thomas Nov 21 '18 at 19:47
1

@Yuriy Petrovskiy's answer is great if you know the levels you want to plot beforehand. If you don't (e.g. because you don't want to plot levels not present in the data), consider using a limit function instead to specify the order:

library(ggplot2)

# the `as.character` calls are only necessary for the OP's example
my_order <- as.character(c(8,3,4,5,6))
ggplot(mtcars, aes(as.character(cyl))) +
  geom_bar() +
  scale_x_discrete(limits = function(x) my_order[my_order %in% x])

From the documentation of scale_x_discrete:

limits
      One of:
      - NULL to use the default scale values
      - A character vector that defines possible values of the scale and their order
      - A function that accepts the existing (automatic) values and returns new ones

With a plain vector as limits, every listed level is shown on the axis, including those absent from the data (which might be preferable):

ggplot(mtcars, aes(as.character(cyl))) +
  geom_bar() +
  scale_x_discrete(limits = my_order)

jan-glx
1

Another option is to manually set the order along the x-axis using fct_relevel from forcats (part of tidyverse). However, for arranging by frequency, @jazzurro provides the best answer by using fct_infreq (also from forcats).

library(tidyverse)

ggplot(iris, aes(
  x = fct_relevel(Species, 'virginica', 'versicolor', 'setosa'),
  y = Petal.Width)) +
  geom_col() +
  xlab("Species") +
  ylab("Petal Width") +
  theme_bw()

Further, the variable needs to be a factor (or a character vector) before using fct_relevel inside ggplot. A numeric column such as cyl does not qualify, so apply factor to it first, then use fct_relevel.

ggplot(mtcars, aes(fct_relevel(factor(cyl), "6", "8", "4"))) +
  geom_bar() +
  labs(x = "cyl")

AndrewGB