I am trying to make a grouped bar chart in which the bars are colored by one binary variable (e.g. Group 1 and Group 2) and the transparency of each bar is based on a continuous value (e.g. a p-value). I want the transparency to be applied within each group's own color, and I want both the gradient and its legend to be continuous.

I have been able to get close using the color, group, and fill aesthetics in geom_bar. The overall gradient works and the outlines of the bars are colored correctly, but I would like the fill to use the outline colors while retaining the transparency. I also tried scale_alpha, which maps the transparencies correctly but doesn't produce a continuous legend.

Here is a small data set like the one I am working with:

## data set
d <- data.frame(ID = rep(c(123, 456), 2),
                description = rep(c("cancer", "infection"), 2),
                variable = c("G2", "G2", "G1", "G1"),
                value = c(1.535709, 1.582127, 4.093683, 4.658328),
                pvals = c(9.806872e-12, 1.160182e-09, 3.179635e-05, 1.132216e-04))

Here is the ggplot code

ggplot(d, aes(x=reorder(description, -pvals), y=value)) +
  geom_bar(stat="identity", aes(col=variable, group=variable, fill=pvals), position="dodge") +
  ylim(0, max(d$value) + 0.6) + xlab("") +
  coord_flip() +
  scale_fill_brewer(palette = "Set1",
                    name="",
                    breaks=c("G1", "G2"),
                    labels=c("Group 1", "Group 2")) +
  scale_fill_continuous(trans = 'log10') # I am using log10 transformation because I have many small p-values and this makes the shading look better

Here is attempt 2, where the fill works but the legend does not:

ggplot(d, aes(x=reorder(description, -pvals), y=value)) +
  geom_bar(stat="identity", aes(fill=variable, alpha = pvals), position="dodge") +
  ylim(0, max(d$value) + 0.6) + xlab("") +
  coord_flip() +
  scale_fill_brewer(palette = "Set1",
                    name="",
                    breaks=c("G1", "G2"),
                    labels=c("G1", "G2")) +
  scale_alpha(trans = "log10")
Harry Smith
  • It looks like for the first fill scale, you actually meant it to be a color scale – camille Aug 22 '19 at 16:32
  • Adding `alpha = pvals` gets a separate legend for alpha, and adding `guide = guide_colorbar()` to the alpha scale throws the error that alpha can't have a colorbar guide. I'm guessing that's because it would be difficult to build an alpha gradient..? You could instead use a discrete guide for your continuous fill, which will combine alpha & fill guides. Here's a deep dive into that issue: https://stackoverflow.com/q/44168996/5325862 – camille Aug 22 '19 at 16:39
  • Just a friendly reminder that `geom_col()` is a shortcut for `geom_bar(stat = "identity")`. – teunbrand Aug 22 '19 at 16:44
  • Rereading, I'm confused now: you say you want the alpha based on group color, which is discrete, but you also say you want a continuous guide for alpha. Which is it exactly, and are you mixing up color and fill? – camille Aug 22 '19 at 16:48
  • @camille, I assumed the intention is to show that there is a statistical difference between the two groups (using the pvals legend) and to differentiate the groups (G1 and G2 legend). Not sure what else there is to see from the plot. – deepseefan Aug 22 '19 at 16:55
  • @deepseefan. First, thank you camille for the link, and thank you for uploading this image. I have been able to get to this point. What I would like, though, is for the Group 2 bars' fill to be #00FFFF and the Group 1 bars' fill to be #800020, while retaining their shading based on the continuous p-value scale. In other words, Group 2's fill would be a gradient of #00FFFF and Group 1's a gradient of #800020, based on p-value. My apologies for the confusion. – Harry Smith Aug 22 '19 at 17:07
  • A few older posts come up with hacks that might be related; here are a few: https://stackoverflow.com/q/13016022/5325862, https://stackoverflow.com/q/49818271/5325862, https://stackoverflow.com/q/50163072/5325862 – camille Aug 22 '19 at 18:26

2 Answers

I've come up with an ugly hack, but it works, so here we are. The idea is to first build your plot as usual, extract its layer data, and use that as input for a new plot. In this new plot, we make separate layers for G1 and G2 and use the ggnewscale package to give each layer its own fill scale. There are a few caveats I'll warn about.

First, we'll make a plot and save it as a variable:

g <- ggplot(d, aes(x=reorder(description, -pvals), y=value)) +
  geom_bar(stat="identity", aes(col=variable, group=variable, fill=pvals), position="dodge") +
  ylim(0, max(d$value) + 0.6) + xlab("") +
  coord_flip() +
  scale_fill_brewer(palette = "Set1",
                    name="",
                    breaks=c("G1", "G2"),
                    labels=c("Group 1", "Group 2")) +
  scale_fill_continuous(trans = 'log10')

Next, we'll take the coordinates from this layer's data and match them back to the original data. Note that this is highly dependent on having unique y-values in your original plot, but I suppose you could also figure this out in other ways.

ld <- layer_data(g)
ld <- ld[, c("xmin", "xmax", "ymin", "ymax")]

# Match back to original data
matches <- match(ld$ymax, d$value)

# Supplement with original data
ld$pvals <- log10(d$pvals[matches])
ld$descr <- d$description[matches]
ld$vars <- d$variable[matches]
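
Because `match()` returns only the first hit, the lookup above would silently pair bars with the wrong rows if two bars shared a height. Below is a minimal sketch of a guard against that. It uses the `value` column from the question's data, and `bar_heights` is a hypothetical stand-in for `ld$ymax` (its order is an assumption, not taken from the real layer data):

```r
# Bar heights from the question's data set.
value <- c(1.535709, 1.582127, 4.093683, 4.658328)

# Hypothetical stand-in for ld$ymax: the same heights in whatever order
# the layer happens to return them (this order is assumed).
bar_heights <- c(4.093683, 1.535709, 4.658328, 1.582127)

# Same lookup as in the answer: position of each bar in the original data.
matches <- match(bar_heights, value)

# Stop early if the lookup is ambiguous or incomplete.
stopifnot(
  !anyDuplicated(value),  # every bar height identifies exactly one row
  !anyNA(matches)         # every bar was found in the original data
)
```

In the real workflow, `bar_heights` would be `ld$ymax` and `value` would be `d$value`.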

Now we'll make a new plot with geom_rect() layers, one per value of vars. After the first layer, we add the fill scale for G1 and then call new_scale_fill(). After that come the second geom_rect() and the second fill scale. Finally, we muddle around with the x-axis so that it resembles the original plot somewhat.

library(ggnewscale)

ggplot(mapping = aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
  geom_rect(data = ld[ld$vars == "G1", ], aes(fill = pvals)) +
  scale_fill_gradient(low = "red", high = "transparent", 
                      limits = c(min(ld$pvals), 0),
                      name = "Log10 P-values G1") +
  new_scale_fill() +
  geom_rect(data = ld[ld$vars == "G2", ], aes(fill = pvals)) +
  scale_fill_gradient(low =  "blue", high = "transparent", 
                      limits = c(min(ld$pvals), 0),
                      name = "Log10 P-values G2") +
  scale_x_continuous(breaks = seq_along(unique(d$description)),
                     labels = c("cancer", "infection")) +
  coord_flip()

And that's the ugly hack. I might have the x-axis labels wrong, but I've found no elegant way to automatically reproduce the x-axis labels without the code getting too long.
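
One way to avoid hard-coding the labels might be to take them from the same `reorder()` call the original plot uses, since a discrete axis places factor levels at positions 1, 2, ... in level order. This is a sketch under that assumption, using only the two relevant columns of the question's data (`x_labels` is a name introduced here for illustration):

```r
# Only the two columns that determine the axis order.
d <- data.frame(
  description = rep(c("cancer", "infection"), 2),
  pvals = c(9.806872e-12, 1.160182e-09, 3.179635e-05, 1.132216e-04)
)

# reorder() sorts the levels by the mean of -pvals within each description;
# the first level sits at x = 1, the second at x = 2.
x_labels <- levels(reorder(d$description, -d$pvals))
x_labels
```

Passing `breaks = seq_along(x_labels), labels = x_labels` to `scale_x_continuous()` should then follow the original ordering. For this data the levels come out as infection first, then cancer, which is the reverse of the hard-coded labels above.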

Note: ggnewscale is known to throw errors in older versions of R, but the GitHub version fixes that error.

teunbrand
  • Thank you! I can at least work with this hack, since I have found no way using just ggplot2. You are correct that the x-axis (infection, cancer) are reversed, but again, I think I can deal with this. – Harry Smith Aug 22 '19 at 17:32

Here is a less verbose version of the script; the output is shown below, if that is what you're after.

library(ggplot2)
base <- ggplot(d, aes(reorder(description, -pvals), value)) +
  geom_bar(stat = "identity", aes(col = variable, group = variable, fill = pvals), position = "dodge")

base_axes_flip <- base + ylim(0, max(d$value) + 0.6) + xlab("") + coord_flip()

bax_color <- base_axes_flip + scale_color_manual(values=c('#800020','#00FFFF'),
                        name="",
                        breaks=c("G1", "G2"),
                        labels=c("Group 1", "Group 2"))

# Note here the scale_color_manual

bax_color + scale_fill_continuous(trans = 'log10')

This produces the following output; I hope it helps. (Image: bar_color_manual)

deepseefan