
Update: What is non-intuitive to me, compared with the question answered in "Order Bars in ggplot2 bar graph", is re-ordering the types within each individual panel separately, according to the median calculated by ggplot2, while preserving the colour scheme.

In the following dataset, different treatments may show different arrangements of types along the x-axis when their median values are calculated.

I am trying to produce a plot showing how the types swap positions between treatments when arranged by ascending median value.

To begin, I am plotting all the different treatments into separate panels with boxplots, and furthermore I am letting ggplot2 assign the colour palette for the types.

The following produces a mock-up of my dataset:

df = data.frame(
  type = rep(c("piece", "work", "man", "noble", "reason", "infinite",
               "faculty", "form", "moving", "express", "admirable", "action",
               "apprehension", "god", "beauty", "world", "paragon", "animals"),
             52),
  # 234 draws are recycled 4 times to fill the 936 rows
  treatment = sample(x = c("alpha", "beta", "charlie", "delta"), 234, replace = TRUE),
  value = abs(rnorm(936, 0.5, 1))
)
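Before plotting, it may help to tabulate which order the types take in each treatment. The following is a minimal base-R sketch on a smaller, hypothetical data frame (`toy`, its three types, and the seed are illustrative assumptions, not part of my real data); it computes the median per type and treatment and lists the types in ascending order of median for each treatment.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# Hypothetical toy data: every type appears 20 times in every treatment
toy <- expand.grid(
  type = c("piece", "work", "man"),
  treatment = c("alpha", "beta"),
  replicate = 1:20,
  stringsAsFactors = FALSE
)
toy$value <- abs(rnorm(nrow(toy), 0.5, 1))

# Median of value for each type x treatment combination
med <- aggregate(value ~ type + treatment, data = toy, FUN = median)

# For each treatment, the types sorted by ascending median
order_by_treatment <- lapply(
  split(med, med$treatment),
  function(d) d$type[order(d$value)]
)
print(order_by_treatment)
```

If the vectors differ between treatments, the types have swapped positions relative to one another.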

This code generates a graph that is close to what I am looking for:

library(ggplot2)

# One panel per treatment; ggplot2 picks the fill colour for each type
ggplot(data = df, aes(x = type, y = value, fill = type)) +
  geom_boxplot() +
  xlab("Soliloquy of Man") +
  ylab("Value") +
  facet_wrap(~treatment, ncol = 2) +
  theme(axis.text.x = element_blank(),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA))

[Figure: boxplots that aren't ordered]

What I would like to do is:

  1. reorder the types in each individual panel so that the median values increase along the x-axis,
  2. use the same colour for each specific type in every panel,
  3. for the first panel in the top left-hand corner, assign the colours to the types so that they form a smooth transition from one end of the colour spectrum to the other.

This will allow me to check and show visually whether the types have swapped positions in reference to the first panel, following which I shall run a non-parametric rank test.

Rewarp
  • You just need to `reorder` your factor. This is an r-faq. See, e.g., Alex's answer here: http://stackoverflow.com/a/9231857/903061. Instead of length, you can use median as the ordering function. – Gregor Thomas Mar 04 '16 at 21:29
  • I tried that, but I couldn't get the syntax right for my specific case. – Rewarp Mar 04 '16 at 21:44
  • @Gregor But `aes(x = reorder(type, value, median)...` would not account for the facets, would it? – lukeA Mar 04 '16 at 21:44
  • Hmm, I was too hasty. I missed the different axis order by facet part. My guess is this is *not possible* with faceting, but you could construct each plot individually and stick them together with `grid.arrange`. – Gregor Thomas Mar 04 '16 at 21:53
  • @Gregor. Yes, I did do that, but ggplot2 has yet to disappoint me with its depth of options and functions, so I didn't really want to use the clunky solution of subsetting and plotting the data individually. I am hoping for an elegant solution that someone else may have thought of. – Rewarp Mar 04 '16 at 22:02

1 Answer

In ggplot2, you can take advantage of treating each position of the boxes on the x-axis as a number ranging from 1 to the number of categories.

Start with your data set, but keep the `type` column as a character vector initially.

library("ggplot2")
library("dplyr")

set.seed(12345)

df = data.frame(
  type = rep(c("piece", "work", "man", "noble", "reason", "infinite",
               "faculty", "form", "moving", "express", "admirable", "action",
               "apprehension", "god", "beauty", "world", "paragon", "animals"),
             52),
  # 234 draws are recycled 4 times to fill the 936 rows
  treatment = sample(x = c("alpha", "beta", "charlie", "delta"), 234, replace = TRUE),
  value = abs(rnorm(936, 0.5, 1)),
  stringsAsFactors = FALSE
)

To get the position of each type, compute the median of `value` for every combination of `type` and `treatment`, then rank these medians within each treatment to get the plot order in each panel.

df2 <- df %>% group_by(treatment, type) %>% 
  summarise(med = median(value)) %>% mutate(plot_order = rank(med))
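The ranking happens per treatment because `summarise()` drops the last grouping variable (`type`), so the result is still grouped by `treatment` when `rank()` runs. Here is a small sketch of that behaviour on a hypothetical toy data frame (`toy`, its values, and `toy_order` are made up for illustration; `.groups = "drop_last"` assumes dplyr 1.0.0 or later and just makes the default explicit).

```r
library(dplyr)

# Hypothetical toy data: two treatments, three types, two values each
toy <- data.frame(
  treatment = rep(c("alpha", "beta"), each = 6),
  type = rep(rep(c("piece", "work", "man"), each = 2), times = 2),
  value = c(1, 3,  5, 7,  3, 5,    # alpha medians: piece 2, work 6, man 4
            8, 10, 1, 1,  4, 6),   # beta medians:  piece 9, work 1, man 5
  stringsAsFactors = FALSE
)

toy_order <- toy %>%
  group_by(treatment, type) %>%
  summarise(med = median(value), .groups = "drop_last") %>%  # still grouped by treatment
  mutate(plot_order = rank(med)) %>%                         # rank within each treatment
  ungroup()

print(toy_order)
```

In this toy example "piece" gets position 1 in alpha but position 3 in beta, which is exactly the kind of swap the plot is meant to show.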

Join the plot order data back to the original data set.

df3 <- df %>% left_join(df2, by = c("treatment", "type")) %>% arrange(treatment, plot_order)

Extract the order of the types in the first panel (treatment "alpha"), and use it to order the levels of the factor.

# Index the column directly so the result is a character vector even if df3 is a tibble
treatment_a_order <- unique(df3$type[df3$treatment == "alpha"])

Re-code `type` as a factor whose levels follow this order.

df4 <- mutate(df3, ftype = factor(type, levels = treatment_a_order))

ggplot(df4, aes(x = plot_order, y = value, fill = ftype)) +
  geom_boxplot() +
  facet_wrap(~treatment) +
  xlab("Soliloquy of Man") +
  ylab("Value") +
  theme(axis.text.x = element_blank(),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA))
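The colours run smoothly across the first panel because ggplot2 hands out its default discrete fill colours in factor-level order, and the levels of `ftype` were set to that panel's median order. A minimal sketch of what the re-levelling does, using a hypothetical reference order (`reference_order` and the short `type` vector are illustrative only):

```r
# Hypothetical order of types in the reference panel, lowest median first
reference_order <- c("man", "piece", "work")

# A few type values as they might appear in the data
type <- c("work", "piece", "man", "piece", "work")

# Levels follow the reference order rather than alphabetical order
ftype <- factor(type, levels = reference_order)

levels(ftype)      # level order that the fill scale will follow
as.integer(ftype)  # position of each value within that level order
```

Since the same factor is mapped to `fill` in every panel, each type keeps its colour across treatments even when its x position changes.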

The one caveat to this approach is that all the levels of your type column have to appear in the first panel.
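One way to check this caveat before plotting is to compare the types present in the reference treatment with the full set of types. This is a sketch only; the helper name `missing_in_reference` is hypothetical, and it assumes the columns are called `type` and `treatment` as above.

```r
# Hypothetical helper: which types never occur in the reference treatment?
missing_in_reference <- function(data, reference = "alpha") {
  all_types <- unique(data$type)
  ref_types <- unique(data$type[data$treatment == reference])
  setdiff(all_types, ref_types)
}

# Toy data in which "man" does not occur under "alpha"
toy <- data.frame(
  type = c("piece", "work", "man", "piece"),
  treatment = c("alpha", "alpha", "beta", "beta"),
  stringsAsFactors = FALSE
)

missing_in_reference(toy)
```

Any type returned here would not be among the factor levels, so it would become `NA` in `ftype`. An empty result means the reference panel covers every type.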

  • Thanks for the help. My actual dataset has NA values in 3 of the 4 types, and as you suggested, I am basing the graph on the treatment which luckily has values for all types, so df2 is throwing in NAs to the dataset. How would I tell dplyr to ignore the NA values? – Rewarp Mar 07 '16 at 19:47
  • Figured it out. I only needed to work out where to put `na.rm = T` in the `median` call. – Rewarp Mar 07 '16 at 22:47
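For later readers, the fix described in the last comment presumably amounts to passing `na.rm = TRUE` to `median()` inside the `summarise()` step. A minimal sketch on hypothetical toy data (`toy` and `toy_order` are illustrative, and `.groups` assumes dplyr 1.0.0 or later):

```r
library(dplyr)

# Hypothetical toy data with one missing value
toy <- data.frame(
  treatment = c("alpha", "alpha", "alpha", "alpha"),
  type = c("piece", "piece", "work", "work"),
  value = c(1, NA, 4, 2),
  stringsAsFactors = FALSE
)

toy_order <- toy %>%
  group_by(treatment, type) %>%
  # na.rm = TRUE makes median() skip missing values instead of returning NA
  summarise(med = median(value, na.rm = TRUE), .groups = "drop_last") %>%
  mutate(plot_order = rank(med))

print(toy_order)
```

Without `na.rm = TRUE`, the median for "piece" would be `NA` here, and that `NA` would carry through to the plot order.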