
I thought my prior question had settled this, but something is still not working. The answer there worked on fake data, so I suspect I am missing something in the structure of my real data.

Here is a new reproducible example, with dummy data that more closely replicates my data set and my problem. My question: how do I sort the segments within each bar of the bar chart by value (largest value within a bar on the bottom, smallest on top)?

library(dplyr)
library(ggplot2)  # ggplot(), geom_col()
library(forcats)  # fct_reorder()

repro_df <- structure(list(Grp = structure(c(5L, 7L, 2L, 3L, 8L, 7L, 10L, 
                                             4L, 4L, 3L, 2L, 2L, 3L, 8L, 9L, 3L, 3L, 6L, 6L, 5L, 6L, 8L, 4L, 
                                             11L, 5L, 1L, 10L, 8L, 1L, 6L, 3L, 1L, 1L, 9L, 5L, 3L, 5L, 4L, 
                                             5L, 5L, 2L, 1L, 9L, 4L, 5L, 10L, 6L, 8L, 3L, 6L, 2L, 6L, 4L, 
                                             7L, 2L, 8L, 9L, 9L, 10L, 5L, 1L, 9L, 1L, 5L, 2L, 8L, 8L, 3L, 
                                             3L, 10L, 7L, 6L, 9L, 2L, 9L, 7L, 1L, 1L, 9L, 1L, 11L, 10L, 9L, 
                                             3L, 7L, 2L, 4L, 7L, 6L, 6L, 4L, 8L, 5L, 5L, 7L, 10L, 8L, 3L, 
                                             6L, 3L, 10L, 10L, 7L, 8L, 9L, 8L, 5L, 7L, 3L, 10L, 11L, 7L, 4L, 
                                             10L, 3L, 8L, 5L, 3L, 5L, 4L, 3L, 10L, 7L, 3L, 4L, 9L, 2L, 3L, 
                                             2L, 1L, 8L, 11L, 2L, 1L, 7L), .Label = c("0", "1", "2", "3", 
                                                                                      "4", "5", "6", "7", "8", "9", "10"), class = "factor"), Segment = structure(c(1L, 
                                                                                                                                                                    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
                                                                                                                                                                    3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
                                                                                                                                                                    1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
                                                                                                                                                                    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
                                                                                                                                                                    3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
                                                                                                                                                                    1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
                                                                                                                                                                    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
                                                                                                                                                                    3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
                                                                                                                                                                    1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
                           Value = c(914, NA, NA, 228, NA, NA, NA, 207, NA, 179, NA, 
                                     NA, 149, NA, NA, 135, NA, NA, NA, 109, NA, NA, 105, NA, NA, 
                                     101, NA, 100, NA, NA, NA, 98, NA, 96, NA, NA, 87, NA, NA, 
                                     77, NA, NA, 74, NA, NA, 57, NA, NA, 49, NA, NA, 35, NA, NA, 
                                     31, NA, NA, 25, NA, NA, NA, 25, NA, NA, 21, NA, 18, NA, NA, 
                                     16, NA, NA, 8, NA, NA, 7, NA, NA, 7, NA, NA, 5, NA, NA, NA, 
                                     NA, 4, NA, NA, 0, 0, NA, NA, 0, NA, NA, 0, NA, NA, NA, NA, 
                                     0, 0, NA, NA, NA, NA, 0, 0, NA, NA, 0, NA, NA, NA, NA, 0, 
                                     NA, NA, 0, NA, NA, 0, 0, NA, NA, NA, 0, NA, NA, NA, 0, NA, 
                                     NA, 0)), class = "data.frame", row.names = c(NA, -135L))

# Reorder the data frame
repro_order_df <- repro_df %>% 
  group_by(Segment) %>% 
  mutate(Grp = fct_reorder(Grp, Value))

head(repro_order_df, 10)
# A tibble: 10 x 3
# Groups:   Segment [3]
   Grp   Segment Value
   <fct> <fct>   <dbl>
 1 4     A         914
 2 6     B          NA
 3 1     C          NA
 4 2     A         228
 5 7     B          NA
 6 6     C          NA
 7 9     A          NA
 8 3     B         207
 9 3     C          NA
10 2     A         179

# Plot
ggplot(repro_order_df, aes(x=Segment, y=Value, fill=Grp)) +
  geom_col(color = "black")

When I plot this data after reordering, the segments within each bar are not ordered by Value as I expected. More oddly, in my real data set the first bar is ordered correctly but the following bars are not. Any thoughts as to why this is not working?

Thanks!

DaveM

2 Answers

How about this, which I think is what you are after...

The trick is to utilise the group aesthetic combined with an additional grouping variable to control the plotting order and use the Grp variable to control the fill colours.
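The idea can be checked on a small made-up data frame before applying it to the real data. This is only a sketch: `toy`, `toy_ord`, `p_toy` and `stack_pos` are hypothetical names and the values are invented for illustration. `layer_data()` returns the positions ggplot2 computed for a layer, so the stacking order can be inspected without looking at the picture.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: two bars (Segment), three fill groups (Grp).
toy <- data.frame(
  Segment = c("A", "A", "A", "B", "B", "B"),
  Grp     = c("x", "y", "z", "x", "y", "z"),
  Value   = c(5, 20, 10, 30, 1, 8)
)

# Rank rows within each Segment: the smallest Value gets g = 1.
toy_ord <- toy %>%
  group_by(Segment) %>%
  arrange(Value, .by_group = TRUE) %>%
  mutate(g = row_number()) %>%
  ungroup()

# group controls the stacking order, fill only controls the colour.
p_toy <- ggplot(toy_ord, aes(x = Segment, y = Value, group = g, fill = Grp)) +
  geom_col(color = "black")

# Computed stack positions: the row with ymin == 0 is the bottom segment.
stack_pos <- layer_data(p_toy, 1)
stack_pos[, c("x", "group", "ymin", "ymax")]
```

With ggplot2's default stacking, the highest group id should end up at the bottom of each bar, which is why ranking in ascending order of Value puts the largest segment at the bottom.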


library(dplyr)
library(ggplot2)
library(forcats)

Option 1) Show merged groups in value order

Create a new grouping variable that orders the groups within each segment by their summed value.


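# Sum Value per Segment/Grp, then rank the groups within each Segment
repro_order_df <- 
  repro_df %>% 
  group_by(Segment, Grp) %>%
  summarise(Value = sum(Value, na.rm = TRUE)) %>% 
  ungroup() %>% 
  group_by(Segment) %>% 
  arrange(Value) %>%           # smallest totals first
  mutate(g = row_number())     # g = stacking order within each Segment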

p1 <- 
  ggplot(repro_order_df, aes(x = Segment, y = Value, group = g, fill = Grp)) +
  geom_col(color = "black") +
  ggtitle("p1 grouped by Grp") +
  theme(legend.position = "bottom")

Option 2) Show groups in value order, with the individual values within each group stacked largest first

Create a new grouping variable that orders rows within each segment by group total, then by value within the group.

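# Keep every row, attach the group total, then rank rows within each Segment
repro_order_df1 <- 
  repro_df %>% 
  group_by(Segment, Grp) %>%
  mutate(Value_g = sum(Value, na.rm = TRUE)) %>% 
  ungroup() %>% 
  group_by(Segment) %>% 
  arrange(Value_g, Value) %>%  # by group total, then by individual value
  mutate(g = row_number())     # g = stacking order within each Segment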


p2 <- 
  ggplot(repro_order_df1, aes(x = Segment, y = Value, group = g, fill = Grp)) +
  geom_col(color = "black") +
  ggtitle("p2 grouped by Grp and Value") +
  theme(legend.position = "bottom")

Which gives you:

[plots p1 and p2]

Created on 2020-05-16 by the reprex package (v0.3.0)

Peter

I think Peter is definitely on the right track. However, I understand the OP to be asking for the individual Values to be ordered by Value within each Segment. I've made Grp a factor whose levels are ordered by decreasing size of the largest Value in each Grp. The code would look like:

library(dplyr)
library(ggplot2)
library(forcats)

# Sort by Value (largest first); as_factor() keeps Grp levels in order of appearance
repro_ord <- repro_df %>% arrange(desc(Value)) %>% 
  mutate(Value_ord = row_number(), Grp = as_factor(as.character(Grp)))
# fill and group live in geom_col() so that later layers do not inherit them
p <- ggplot(repro_ord, aes(x = Segment, y = Value)) +
  geom_col(aes(fill = Grp, group = rev(Value_ord)), color = "black")

# Total per Segment, used to label the top of each bar
bar_tot <- repro_ord %>% group_by(Segment) %>% 
  summarize(Total = sum(Value, na.rm = TRUE)) %>% 
  ungroup() %>% mutate_if(is.numeric, round, 0)

p1 <- p + geom_text(data = bar_tot, aes(x = Segment, y = Total, label = Total),
                    vjust = -0.5, size = 3, hjust = 0.5, fontface = "bold")

which gives:

[plot: bars stacked by Value within each Segment, with bar totals labelled]

This answer should be regarded as a long comment on Peter's answer rather than a new answer.
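The factor step in the code above relies on `as_factor()` creating levels in order of first appearance when given a character vector. A minimal sketch on made-up data (`toy` and `toy_ord` are hypothetical names, and the values are invented) shows the effect after sorting by descending Value:

```r
library(dplyr)
library(forcats)

# Hypothetical toy data; factor() gives Grp alphabetical levels to start with.
toy <- data.frame(
  Grp   = factor(c("a", "b", "c", "a", "b", "c")),
  Value = c(3, 50, 7, 1, 2, 9)
)

toy_ord <- toy %>%
  arrange(desc(Value)) %>%
  mutate(
    Value_ord = row_number(),            # 1 = largest Value overall
    Grp = as_factor(as.character(Grp))   # levels follow first appearance
  )

levels(toy$Grp)      # original level order
levels(toy_ord$Grp)  # level order after sorting by descending Value
```

Because the rows are sorted first, the group holding the single largest Value becomes the first level, which in turn fixes the legend and fill order.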

WaltS
  • I think your answer is correct, having reread the OP's question. It is very generous of you to consider your answer a comment; however, you should take the credit. What's more, your answer is less convoluted! – Peter May 17 '20 at 07:28
  • One complication--I created a separate df to label the bar totals. When I go to add that to the plot, I get an error saying `Value_ord not found`. The code: `bar_tot <- repro_ord %>% group_by(Segment) %>% summarize(Total = sum(Value, na.rm = TRUE)) %>% ungroup() %>% mutate_if(., is.numeric, round, 0) ` And then to plot: `p1 <- p1 + geom_text(data = bar_tot, aes(Segment, Total, label = Total, vjust = -0.5 ), size = 3, hjust = 0.5, fontface = "bold" ) ` – DaveM May 17 '20 at 12:24
  • I've edited the answer to include the totals at the top of the bars. You needed to move `fill` and `group` from `ggplot` to `geom_col` to keep them from being inherited by `geom_text`. – WaltS May 17 '20 at 14:05
  • Brilliant. Thank you. I tried `inherit.aes = FALSE` but must have had the syntax wrong. Much appreciated. – DaveM May 17 '20 at 22:23
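
For reference, the `inherit.aes = FALSE` route mentioned in the last comment can be sketched as follows. This is an illustration on made-up data, not the code from the answer: `toy`, `totals` and `p_lab` are hypothetical names. Here `fill` and `group` stay in the top-level `ggplot()` call, and the text layer is told not to inherit them, so it has to spell out its own `x`, `y` and `label`.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data, ranked by descending Value as in the answer.
toy <- data.frame(
  Segment = c("A", "A", "B", "B"),
  Grp     = c("x", "y", "x", "y"),
  Value   = c(5, 20, 30, 1)
) %>%
  arrange(desc(Value)) %>%
  mutate(Value_ord = row_number())

# One total per bar, used only for the labels.
totals <- toy %>%
  group_by(Segment) %>%
  summarise(Total = sum(Value, na.rm = TRUE))

p_lab <- ggplot(toy, aes(x = Segment, y = Value, fill = Grp,
                         group = rev(Value_ord))) +
  geom_col(color = "black") +
  geom_text(
    data = totals,
    aes(x = Segment, y = Total, label = Total),
    inherit.aes = FALSE,   # do not look for Grp or Value_ord in `totals`
    vjust = -0.5, size = 3, fontface = "bold"
  )

# Computed data for the text layer: one label per bar.
lab <- layer_data(p_lab, 2)
```

Either approach avoids the `Value_ord not found` error. Moving the aesthetics into `geom_col()` keeps the text layer shorter, while `inherit.aes = FALSE` leaves the main mapping in one place.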