I've been battling to order and plot a simple dataframe as a bar chart in ggplot2.

I want to plot the data as it appears, so that the values ('count' variable) for the corresponding categories (e.g. 'humans', 'male') are plotted from high to low.

I've followed other threads on this site asking similar questions, but can't get this to work!

## Dataset (mesh2)

#Category                       Count
#Humans                         62
#Male                           40
#Female                         38
#Adult                          37
#Middle Aged                    30
#Liver/anatomy & histology      29
#Organ Size                     29
#Adolescent                     28
#Child                          21
#Liver/radiography*             20
#Liver Transplantation*         20
#Tomography, X-Ray Computed     20
#Body Weight                    18
#Child, Preschool               18
#Living Donors*                 18
#Infant                         16
#Aged                           14
#Body Surface Area              14
#Regression Analysis            11
#Hepatectomy                    10

## read in data (mesh2) as object (mesh2)

mesh2 <- read.csv("mesh2.csv", header = T)

## order data by count of mesh variable

mesh2$cat2 <- order(mesh2$Category, mesh2$Count, decreasing=TRUE)

## Barplot created in ggplot2

library(ggplot2)

mesh2p <- ggplot(mesh2, aes(x = cat2, y = Count)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(
    breaks = 1:20,
    labels = c("Humans", "Male", "Female", "Adult", "MAged",
               "Liver anat & hist", "Organ Size", "Adolescent",
               "Child", "Liver radiog", "Liver Transplnt",
               "Tomog X-Ray Computed", "Body Weight", "Child Preschool",
               "Living Donors", "Infant", "Aged", "BSA",
               "Regression Analysis", "Hepatectomy")
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Ben G Small
  • Make your `Category` an ordered factor. See `?factor` to learn how to do that. – Roland Jun 06 '13 at 12:28
  • @Roland No, that is *not* how to do this in general. What is the implied ordering in the set `c("human","male","female","cat")`? An ordered factor is for data where the levels *themselves* convey some quantitative information, e.g. the set `c("wet","moist","dry")`. You're wrong about this because storing these data as an ordered factor will do the wrong thing (polynomial contrasts) if the unordered data are used in a model in R. What is wanted is the `reorder()` function. – Gavin Simpson Jun 06 '13 at 17:48
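To illustrate the distinction drawn in the comments, here is a minimal base-R sketch. The vectors `cats` and `counts` are made up for illustration (the category names come from the comment, the counts are hypothetical). It shows that `reorder()` only rearranges the levels and leaves the factor unordered, whereas an ordered factor gets polynomial contrasts under R's default contrast settings.

```r
# Hypothetical data: category names from the comment, counts invented
cats   <- c("human", "male", "female", "cat")
counts <- c(62, 40, 38, 5)

# reorder() sorts the levels by the score (here: decreasing count)
# but the result is still a plain, unordered factor
f_reorder <- reorder(cats, -counts)
levels(f_reorder)
is.ordered(f_reorder)

# An ordered factor claims the levels themselves are ranked
f_ordered <- factor(cats, levels = cats, ordered = TRUE)
is.ordered(f_ordered)

# With default options, a model would use treatment contrasts for the
# unordered factor and polynomial contrasts (.L, .Q, .C) for the ordered one
colnames(contrasts(f_reorder))
colnames(contrasts(f_ordered))
```

For plotting, only the level order matters, so the unordered factor returned by `reorder()` is enough.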

1 Answer

You want `reorder()`. Here is an example with dummy data:

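set.seed(42)  # make the dummy data reproducible
df <- data.frame(Category = sample(LETTERS), Count = rpois(26, 6))

library(ggplot2)

# p1: Category in its default (alphabetical) order
p1 <- ggplot(df, aes(x = Category, y = Count)) +
         geom_bar(stat = "identity")

# p2: Category levels reordered by decreasing Count
p2 <- ggplot(df, aes(x = reorder(Category, -Count), y = Count)) +
         geom_bar(stat = "identity")

# draw both plots together for comparison
library(gridExtra)
grid.arrange(arrangeGrob(p1, p2))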

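Giving:

[Figure: the two bar charts, `p1` with categories in their default alphabetical order and `p2` with categories ordered by decreasing `Count`]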

Use `reorder(Category, Count)` to have `Category` ordered from low to high.
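Applied to the data in the question, this might look like the sketch below. It is an illustration only: the `mesh` data frame holds four rows typed in by hand from the question's table rather than read from `mesh2.csv`, and the plotting step is guarded so the snippet still runs where ggplot2 is not installed.

```r
# Four rows copied by hand from the question's table (not the full data set),
# deliberately entered in alphabetical order
mesh <- data.frame(
  Category = c("Adult", "Female", "Humans", "Male"),
  Count    = c(37, 38, 62, 40)
)

# Reorder the factor levels by decreasing Count
mesh$Category <- reorder(mesh$Category, -mesh$Count)
levels(mesh$Category)

# Build the bar chart; no scale_x_continuous() relabelling is needed
# because the x axis is now a factor with the desired level order
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mesh, aes(x = Category, y = Count)) +
    geom_bar(stat = "identity") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
  # print(p) draws the plot
}
```

Reordering the column once, as here, has the same effect on the axis as calling `reorder()` inside `aes()`.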

Gavin Simpson
  • Hi Gavin. Thanks very much for this. I just logged on, having spent the last hour trying to figure out a solution, and also came up with the solution you present! Nevertheless, your explanation of whether levels convey quantitative information, and of when to use `factor` versus `reorder`, is really helpful. Thanks again. – Ben G Small Jun 06 '13 at 20:51
  • I cannot figure out why this does not work if you have more than one fill per category. – JHo Mar 17 '16 at 18:47
  • I think your code will not work with negative values; please see http://dpaste.com/0Y5T182. In this case, your plot will not be in descending order after plotting. Code here too: `dat.m <- structure(list(Date = c("1.5.2017", "1.3.2017", "1.5.2017", "1.3.2017"), variable = structure(c(1L, 1L, 2L, 2L), .Label = c("Total", "Area"), class = "factor"), value = c(110, -90, 700, 880)), row.names = c(NA, -4L), .Names = c("Date", "variable", "value"), class = "data.frame"); library(ggplot2); ggplot(dat.m, aes(x = reorder(Date, -value), y = value, fill=variable)) + geom_bar(stat='identity')`. – Léo Léopold Hertz 준영 May 25 '17 at 15:21
  • @LéoLéopoldHertz준영 That is not correct; check it yourself using the code in my example, replacing `df` with `df <- data.frame(Category = sample(LETTERS), Count = rnorm(26))`. Your example fails because `value` can't order the levels uniquely: in the example there are two values of `value` per level of `Date`. – Gavin Simpson May 25 '17 at 16:31
  • I don't have negative values, and yet the code won't work. It still shows some unordered bars. This happens on a facet_wrap chart. – Carrol Mar 26 '18 at 13:09
  • @Mel what is not ordered? The facets or the bars? – Gavin Simpson Mar 26 '18 at 15:20
  • @GavinSimpson the bars... they appear mixed, with some facets correctly ordered and others not ordered. – Carrol Mar 26 '18 at 15:27
  • @Mel ah ok; you want the ordering of the x-factor *within* the levels of the facet factor, but that is not something that factors allow for. Recall that one is ordering the levels of the entire factor variable. This may warrant its own question on Stack Overflow. – Gavin Simpson Mar 26 '18 at 15:30
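On the point raised in the comments about several values per level: `reorder()` collapses them into one score per level with its `FUN` argument, which defaults to `mean`. The sketch below uses made-up data (the `stacked` data frame and its values are hypothetical) and assumes that for stacked bars you want the order to follow the stack totals, in which case `FUN = sum` is one option. It only matches the visual bar heights when all values are non-negative.

```r
# Hypothetical data: unequal numbers of fill groups per category
stacked <- data.frame(
  Category = c("a", "a", "a", "b", "b", "c"),
  Fill     = c("x", "y", "z", "x", "y", "x"),
  Count    = c(4, 4, 4, 5, 5, 9)
)

# Default: levels ordered by the mean of -Count within each category
by_mean <- reorder(stacked$Category, -stacked$Count)
levels(by_mean)

# Ordered by the total per category, which is the height of a stacked bar
by_sum <- reorder(stacked$Category, -stacked$Count, FUN = sum)
levels(by_sum)

# The same call can be used directly inside aes()
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(stacked,
              aes(x = reorder(Category, -Count, FUN = sum),
                  y = Count, fill = Fill)) +
    geom_bar(stat = "identity")
  # print(p) draws the plot
}
```

Here the per-category totals are 12, 10 and 9, so ordering by sum gives `a`, `b`, `c`, while ordering by mean gives the reverse. This does not address ordering within facets, which, as noted above, a single factor cannot express.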