I'm combining two layers in ggplot that were created from two different data sets and want to control the order in which the legend appears.

With example data and code:

library(dplyr)
library(ggplot2)

base <- 
data.frame(idea_num = c(1, 2), 
           value = c(-50, 90), 
           it_cost = c(30, 10))

group <- 
data.frame(idea_num = c(1, 1, 2, 2), 
           group = c("a", "b", "a", "b"), 
           is_primary = c(TRUE, FALSE, FALSE, TRUE), 
           group_value = c(-40, -10, 20, 70))

base %>% 
left_join(group) %>%
arrange(desc(value)) %>%
mutate(idea_num = idea_num %>% factor(levels = unique(idea_num)), 
       is_primary = is_primary %>% factor(levels = c("TRUE", "FALSE"))) %>%
ggplot(aes(x = idea_num, y = group_value, fill = is_primary)) +
geom_bar(stat = "identity") +
geom_bar(data = base %>% 
             arrange(desc(value)) %>% 
             mutate(idea_num = idea_num %>% factor(levels = unique(idea_num))),
         aes(x = idea_num, y = it_cost, alpha = 0.1, fill = "it_cost"), 
         stat = "identity") +
scale_fill_manual(name = "Group", labels = c("TRUE" = "Primary", "FALSE" = "Secondary", "it_cost" = "IT Cost"), 
                  values = c("TRUE" = "blue", "FALSE" = "red",  "it_cost" = "black")) +
scale_alpha(guide = "none") +
theme(legend.position = "bottom")

I get a figure

but I'd like the legend to appear in the order of Primary, Secondary, IT Cost.

Were all of the numbers I'm trying to plot part of the same grand number, I could easily melt the dataframe and sum everything; however, the values from the group$group_value need to be displayed separate from base$it_cost.

If I plot only the values from the first layer, i.e.,

base %>% 
left_join(group) %>%
arrange(desc(value)) %>%
mutate(idea_num = idea_num %>% factor(levels = unique(idea_num)), 
       is_primary = is_primary %>% factor(levels = c("TRUE", "FALSE"))) %>%
ggplot(aes(x = idea_num, y = group_value, fill = is_primary)) +
geom_bar(stat = "identity") +
scale_fill_manual(name = "Group", labels = c("TRUE" = "Primary", "FALSE" = "Secondary"), 
                  values = c("TRUE" = "blue", "FALSE" = "red")) +
theme(legend.position = "bottom")

I get the figure I expect.

How can I add the second layer and adjust the ordering of the legend boxes? I do not believe that this question or this question is entirely relevant to mine, as the former deals with the levels of a factor and the latter with the ordering of multiple legends.

Can I do what I'd like to do? Is there a better way of constructing this plot?

Steven

2 Answers

Use scale_fill_manual(..., limits = ...), which sets the order in which the legend keys appear:

... +
  scale_fill_manual(name = "Group",
                    labels = c("TRUE" = "Primary", "FALSE" = "Secondary", "it_cost" = "IT Cost"), 
                    limits = c("TRUE", "FALSE", "it_cost"), 
                    values = c("TRUE" = "blue", "FALSE" = "red",  "it_cost" = "black")) +
  ...

This puts the legend entries in the order Primary, Secondary, IT Cost.
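
For a version that runs end to end, here is a minimal sketch of the limits approach on the question's example data. It is a variant under a few assumptions, not the exact code from the question: it uses geom_col() in place of geom_bar(stat = "identity"), it sets alpha as a constant outside aes() (0.5 is an arbitrary choice) so that scale_alpha() is not needed, and the names idea_levels, groups, and costs are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Example data from the question
base <- data.frame(idea_num = c(1, 2),
                   value = c(-50, 90),
                   it_cost = c(30, 10))
group <- data.frame(idea_num = c(1, 1, 2, 2),
                    group = c("a", "b", "a", "b"),
                    is_primary = c(TRUE, FALSE, FALSE, TRUE),
                    group_value = c(-40, -10, 20, 70))

# Order the ideas by descending value, as the question does
idea_levels <- base %>% arrange(desc(value)) %>% pull(idea_num)

# Hypothetical helper data frames, one per layer
groups <- group %>%
  mutate(idea_num = factor(idea_num, levels = idea_levels),
         is_primary = as.character(is_primary))
costs <- base %>%
  mutate(idea_num = factor(idea_num, levels = idea_levels))

p <- ggplot(groups, aes(x = idea_num, y = group_value, fill = is_primary)) +
  geom_col() +
  # Second layer from a different data set; alpha is a constant, not an aesthetic
  geom_col(data = costs, aes(y = it_cost, fill = "it_cost"), alpha = 0.5) +
  scale_fill_manual(name = "Group",
                    labels = c("TRUE" = "Primary", "FALSE" = "Secondary", "it_cost" = "IT Cost"),
                    limits = c("TRUE", "FALSE", "it_cost"),  # legend order
                    values = c("TRUE" = "blue", "FALSE" = "red", "it_cost" = "black")) +
  theme(legend.position = "bottom")
p
```

Reordering the vector passed to limits is enough to reorder the legend; the labels and values are matched by name, so they do not need to change.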

That said, I think you may want to consider a few different approaches:

A: Why do you create your data in such a complex way, ending up with multiple observations of IT Cost for the same idea number? I don't know your data, and you may well have your reasons, but a simple dataset along these lines:

  idea_num value      type
1        1   -40   Primary
2        1   -10 Secondary
3        2    20 Secondary
4        2    70   Primary
5        1    30   IT Cost
6        2    10   IT Cost

would simplify things quite a bit.

B: Why do you want to stack/overplot these two separate barplots? I would do position="dodge" instead to have separate bars.
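
Suggestions A and B can be combined in a short sketch, assuming the question's example data. The IT Cost rows here are taken from the it_cost column of base, the data frame name long is made up for illustration, and geom_col() stands in for geom_bar(stat = "identity").

```r
library(dplyr)
library(ggplot2)

# Example data from the question
base <- data.frame(idea_num = c(1, 2),
                   value = c(-50, 90),
                   it_cost = c(30, 10))
group <- data.frame(idea_num = c(1, 1, 2, 2),
                    group = c("a", "b", "a", "b"),
                    is_primary = c(TRUE, FALSE, FALSE, TRUE),
                    group_value = c(-40, -10, 20, 70))

# A: stack both sources into one long table with a single `type` column
long <- bind_rows(
  transmute(group, idea_num, value = group_value,
            type = ifelse(is_primary, "Primary", "Secondary")),
  transmute(base, idea_num, value = it_cost, type = "IT Cost")
) %>%
  mutate(idea_num = factor(idea_num),
         # The factor level order determines the legend order
         type = factor(type, levels = c("Primary", "Secondary", "IT Cost")))

# B: dodge the bars so each type gets its own bar per idea
p <- ggplot(long, aes(x = idea_num, y = value, fill = type)) +
  geom_col(position = "dodge") +
  scale_fill_manual(name = "Group",
                    values = c("Primary" = "blue", "Secondary" = "red", "IT Cost" = "black")) +
  theme(legend.position = "bottom")
p
```

With a single data frame, the legend order follows the factor levels of type, so no limits argument is needed.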

Ott Toomet
  • Sadly, I can't change how the data come to me. Also, using `position = "dodge"` would make the figure nearly illegible on the x-axis with my actual data. – Steven Oct 23 '17 at 21:02
  • Sure. I thought you have your reasons. It would be rather easy to convert your example data into the table above, and then it is pretty trivial to have all three bars dodged next to each other. – Ott Toomet Oct 23 '17 at 21:36
0
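library(dplyr)
library(tidyr)
library(ggplot2)

df2 <- base %>% 
  left_join(group) %>% 
  select(-group) %>%  # drop the group id so each idea collapses to one row and it_cost is not counted twice
  mutate(is_primary=paste0("pri_", is_primary+0)) %>%  # TRUE -> "pri_1", FALSE -> "pri_0"
  spread(is_primary, group_value) %>%
  gather(yvar, y, it_cost, pri_0, pri_1)

# "pri_1" is the primary group, so it comes first to match the "Primary" label
df2$yvar <- factor(df2$yvar, levels=c("pri_1", "pri_0", "it_cost"), 
             labels=c("Primary", "Secondary", "IT Cost")) 
df2$idea_num <- factor(df2$idea_num, levels=c(2, 1))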

ggplot(df2, aes(idea_num, y, fill=yvar)) + 
  geom_bar(stat="identity") +
  scale_fill_manual("Group", values=c("blue", "red", "black")) +
  scale_alpha(guide = "none") +
  theme(legend.position = "bottom")

Adam Quek