I have a dataframe that is a summary of a larger dataset, which I have tried to replicate below. I set the score column as a factor so that the labels in the plots are correct.

I want to sort this dataframe by score group and, within each group, by the count column (n). The order in which I would like ggplot to show my horizontal bars is: score from min to max (bottom to top of the graph) on the x-axis (the y-axis in the output, since the bars are flipped/horizontal) and, within a score group, starting from the min score group, in descending order of count (n). The same theme order should be preserved for the next group (i.e. low), but themes that didn't appear in a previous score group can be inserted according to their count (n) value.

I tried sorting my dataframe, but the results are not what I expect. For example, rows 5 and 6 should be switched in my sorted dataframe, since cos appeared before foo in the previous score (i.e. min). I tried changing the order of the factor levels using reorder and also with forcats, but to no avail.

library(tidyverse)

df = tribble(
~score, ~ theme, ~ n,
5, "foo", 1,
5, "bar", 1,
4, "let", 3,
3, "let", 1,
3, "cos", 1,
3, "foo", 2,
2, "foo", 3,
2, "let", 4,
2, "cos", 5
)

data = df %>%
      group_by(score, theme) %>%
      arrange(desc(score), n) %>%
      mutate_at("score", function(x) factor(x, levels = c(1, 2, 3, 4, 5), labels = c("min", "low", "avg", "high", "max")))
data

plot = data %>%
      ggplot(mapping = aes(x = score, y = n, fill = theme)) +
      geom_col(position = position_dodge2(width = 0.9, preserve = "single")) +
      coord_flip() +
      scale_y_continuous(expand = expansion(mult = c(0, .1))) +
      guides(fill = guide_legend(ncol = 2, byrow = TRUE)) +
      labs(y = "n", x = "scoring", fill = "vars")

plot

My expected graph would be:

MAX BAR (<- unsure since equal)
MAX FOO
HIGH LET
AVG FOO
AVG LET
AVG COS
LOW FOO
LOW LET
LOW COS

1 Answer

You factored the score but not the theme. Note, though, that ggplot2 orders the levels starting from the y-axis origin, so your order of "BAR before FOO" is better stated as "BAR above FOO" or "BAR after FOO", which in factor terms means "FOO before BAR".

df$theme <- factor(df$theme, levels = rev(c("bar", "foo", "let", "cos")))
# run the 'data' and 'plot' code, unchanged

(It's not strictly necessary to use rev here; its use is purely demonstrative, declaring that the order we think of as more important, "bar" on top, is opposite to the direction ggplot uses.)
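
To see what the factor() call above changes, here is a minimal base-R sketch. The theme vector is copied from the question's data; nothing else is assumed. It compares the default (alphabetical) levels with the explicitly declared ones.

```r
# Theme values copied from the question's tribble
theme <- c("foo", "bar", "let", "let", "cos", "foo", "foo", "let", "cos")

# Without explicit levels, factor() sorts the unique values alphabetically
default_levels <- levels(factor(theme))

# With explicit levels, the declared order is kept as given
declared_levels <- levels(factor(theme, levels = rev(c("bar", "foo", "let", "cos"))))

print(default_levels)   # "bar" "cos" "foo" "let"
print(declared_levels)  # "cos" "let" "foo" "bar"
```

Since the bars are drawn from the axis origin outward, the first declared level ("cos") should end up lowest within each score group and the last ("bar") highest.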

If you want the same colors as before my change, then add

... +
  scale_fill_manual(values = c(bar="#F8766D", foo="#00BFC4", cos="#7CAE00", let="#C77CFF"))

(I derived the colors by using gg_color_hue(4), a commonly shared helper function that is not part of ggplot2, and then reordering the values= vector to get it right.)
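
For reference, a hedged sketch of how such a palette could be built without reordering by hand. It assumes gg_color_hue refers to the widely circulated helper that approximates ggplot2's default hue palette with base R's hcl(); theme_colors is a name introduced here for illustration. Naming the colors by the alphabetically sorted themes (the default level order before the change) gives a vector that scale_fill_manual() matches by name.

```r
# Assumption: this is the commonly shared helper, not an exported ggplot2 function
gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

# Theme names from the question's data
themes <- c("foo", "bar", "let", "cos")

# Attach each color to the theme that had it under the default alphabetical order
theme_colors <- setNames(gg_color_hue(length(themes)), sort(themes))
print(theme_colors)
```

With a named vector, `... + scale_fill_manual(values = theme_colors)` keeps each theme's color regardless of how the factor levels are later reordered.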

r2evans
  • Thanks, I did not know I had to factor the "fill" property, i.e. the theme. Is there a more elegant solution than providing a vector with all the names of the themes? I generate these graphs in a pwalk, so this seems too hard-coded for my case. – Thibault Fouquaert May 04 '21 at 15:20
  • With this data, I see no obvious way to programmatically infer which should be ordered first: not alphabetic, not by count. – r2evans May 04 '21 at 15:29
  • I seem to get it working using `mutate_at("theme", function(x) factor(x, levels = rev(unique(x))))`. Maybe you know why, because I'm left even more confused now. If you have an idea, add it to your answer and I'll gladly accept it. Thanks! – Thibault Fouquaert May 04 '21 at 15:42
  • That works, but it is fragile: it is dependent on the order of data as it appears in the frame. You assume that `unique` will put things in the order you want, but realize that `unique(c(1,2,3))` and `unique(c(2,3,1))` produce different results. If the order of how they appear in that frame is constant, then this should work fine. – r2evans May 04 '21 at 15:55
  • I don't feel comfortable with the sample data to fixate on (what I only know as) arbitrary ordering with a sample frame. You asked about ordering the bars, and that's been addressed. I understand that the underlying problem is how to order `theme`s, so if you really feel that you need to resolve that, then either ask a new question (and please accept this answer) or edit your question here (and lacking more context, there's not much I can do for this answer). – r2evans May 04 '21 at 15:59
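
One way to avoid both the hard-coded vector and the dependence on row order discussed in the comments is to sort explicitly before calling unique(). The sketch below is only an assumption about the intended rule: walk through the scores from lowest to highest, take themes by descending n, and keep each theme at its first appearance. It uses base R and the question's sample data; theme_levels is a name introduced here for illustration.

```r
# Sample data from the question, as a plain data.frame
df <- data.frame(
  score = c(5, 5, 4, 3, 3, 3, 2, 2, 2),
  theme = c("foo", "bar", "let", "let", "cos", "foo", "foo", "let", "cos"),
  n     = c(1, 1, 3, 1, 1, 2, 3, 4, 5)
)

# Sort by score (ascending), then by count (descending)
sorted <- df[order(df$score, -df$n), ]

# unique() keeps the first appearance of each theme in that sorted order
theme_levels <- unique(sorted$theme)
print(theme_levels)

# Declare the levels explicitly so the plot no longer depends on row order
df$theme <- factor(df$theme, levels = theme_levels)
```

On the sample data this gives the same levels as the hand-written `rev(c("bar", "foo", "let", "cos"))`. Ties in n fall back to the original row order, so the result is still partly data-dependent when counts are equal.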