I need to use gghighlight in a clustered bar chart in R to highlight one single bar. My code and sample data look like this:

library(tidyr)
library(ggplot2)
library(gghighlight)
dat <- data.frame(country=c('USA','Brazil','Ghana','England','Australia'), Stabbing=c(15,10,9,6,7), Accidents=c(20,25,21,28,15), Suicide=c(3,10,7,8,6))
dat.m <- reshape2::melt(dat, id.vars='country')
dat.g <- gather(dat, type, value, -country)
ggplot(dat.g, aes(type, value)) + 
  geom_bar(aes(fill = country), stat = "identity", position = "dodge") +
  gghighlight(type == "Accidents" & country == "Brazil")

But this gives me this awkward graph.

How can I get gghighlight to highlight only one single bar of one group (so combining two conditions for two discrete variables)?

tjebo
Ben
  • It seems to be related to the use of `dodge` only. Maybe a bug that has not yet been reported. – dc37 Mar 09 '20 at 22:49
  • I agree it looks as if this is related to dodging - but I don't think this is a bug. I believe gghighlight is currently not *meant* to be used with dodge. None of the examples in the `gghighlight` [vignette](https://cran.r-project.org/web/packages/gghighlight/vignettes/gghighlight.html) contains dodge. Maybe a reason for a feature request. Or just plot it in good old separating the data way. – tjebo Mar 09 '20 at 23:20
  • @Tjebo, I agree. I expressed myself badly when calling this a bug. – dc37 Mar 09 '20 at 23:46

2 Answers

Here are two alternative options for highlighting a single column in this type of plot:

1) make a new variable (named highlight below) and fill by that (and, if you like, use the line colors to color by country)

2) manually annotate the one column you want to highlight with an arrow and/or text (or work out how to automate the positioning, but that would be more involved) - could be an option for one final figure

library(tidyr)
library(ggplot2)
dat <- data.frame(country=c('USA','Brazil','Ghana','England','Australia'), 
    Stabbing=c(15,10,9,6,7), 
    Accidents=c(20,25,21,28,15), Suicide=c(3,10,7,8,6))
dat.m <- reshape2::melt(dat, id.vars='country')
dat.g <- gather(dat, type, value, -country)

## set highlighted bar (the comparison already returns TRUE/FALSE)
dat.g$highlight <- dat.g$type == "Accidents" & dat.g$country == "Brazil"

## option 1: use fill to highlight, colour for country
## (alpha is set in the geom; ggplot() silently ignores it)
ggplot(dat.g, aes(type, value, fill = highlight, colour=country)) + 
    geom_bar(stat = "identity", position = "dodge2", size=1, alpha=.6) +
    scale_fill_manual(values = c(`FALSE` = "grey20", `TRUE` = "red"))+
    guides(fill = "none") + 

    ## option 2: use annotate to manually label a specific column:
    annotate(geom = "curve", x = 1.15, y = 30, xend = 1.35, yend = 26, 
        curvature = .2, arrow = arrow(length = unit(2, "mm"))) +
    annotate(geom = "text", x = 1, y = 31, label = "Highlight", hjust = "left")

Created on 2020-03-10 by the reprex package (v0.3.0)

user12728748
  • That's very nice and also preferable over my approach. You don't even need to assign a new column but can fill by conditional statement directly. – tjebo Mar 10 '20 at 13:43
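
Following up on the comment above, here is a minimal sketch of filling by the condition directly, without a helper column. It reuses the question's sample data; the plot name `p`, the grey shade and the explicit `group = country` (added so that dodging stays per country rather than per fill value) are my own assumptions, not part of the original answer.

```r
library(tidyr)
library(ggplot2)

dat <- data.frame(
  country   = c("USA", "Brazil", "Ghana", "England", "Australia"),
  Stabbing  = c(15, 10, 9, 6, 7),
  Accidents = c(20, 25, 21, 28, 15),
  Suicide   = c(3, 10, 7, 8, 6)
)
dat.g <- gather(dat, type, value, -country)

# The logical expression inside aes() is evaluated per row, so no extra
# column is needed. group = country keeps one dodge slot per country.
p <- ggplot(dat.g, aes(type, value, group = country,
                       fill = type == "Accidents" & country == "Brazil")) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c(`FALSE` = "grey60", `TRUE` = "red"),
                    guide = "none")
p
```

Naming the values `FALSE` and `TRUE` ties each colour to the condition explicitly instead of relying on the order of the levels.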

I think gghighlight is not built for this kind of plot - not yet! You could file a feature request. It is a bit unclear, though, whether this visualisation is very helpful. gghighlight always draws everything, with the unhighlighted data greyed out in the background - this is what produces the "weird" shadows when dodging.

If you want to keep using gghighlight, maybe try faceting, which they suggest in their vignette.

A suggestion - Use facets:

(using mtcars as example)

library(tidyverse)
library(gghighlight)

mtcars2 <- mtcars %>% mutate(cyl = as.character(cyl), gear = as.character(gear))
ggplot(mtcars2, aes(cyl, disp, fill = gear))  +
  geom_col() + #no dodge
  gghighlight(cyl == "4") + #only one variable
  facet_grid(~ gear) #the other variable is here
#> Warning: Tried to calculate with group_by(), but the calculation failed.
#> Falling back to ungrouped filter operation...

Created on 2020-03-09 by the reprex package (v0.3.0)

Or, here without gghighlight, in a more traditional subsetting approach. You need to make a subset of the data which contains a row for each group you want to dodge by, in this case "cyl" and "gear". I replace the irrelevant data with "NA"; you could also use "0".

library(tidyverse)

mtcars2 <- mtcars %>% 
  mutate(cyl = as.character(cyl), gear = as.character(gear)) %>% 
  group_by(cyl, gear) %>% 
  summarise(disp = mean(disp))

subset_mt <- mtcars2 %>% mutate(highlight = if_else(cyl == '4' & gear == '3', disp, NA_real_))

ggplot()  +
  geom_col(data = mtcars2, aes(cyl, disp, group = gear), fill = 'grey', alpha = 0.6, position = 'dodge') +
  geom_col(data = subset_mt, aes(cyl, highlight, fill = gear), position = 'dodge') 
#> Warning: Removed 7 rows containing missing values (geom_col).

Created on 2020-03-10 by the reprex package (v0.3.0)
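
The same subsetting idea can be sketched on the question's own data. This is only an illustration under my assumptions: the column name `highlight`, the plot name `p2` and the grey fill are made up for the example. Every row is kept so that both layers dodge into the same slots, and all values except the bar of interest are blanked with `NA`.

```r
library(tidyr)
library(ggplot2)

dat <- data.frame(
  country   = c("USA", "Brazil", "Ghana", "England", "Australia"),
  Stabbing  = c(15, 10, 9, 6, 7),
  Accidents = c(20, 25, 21, 28, 15),
  Suicide   = c(3, 10, 7, 8, 6)
)
dat.g <- gather(dat, type, value, -country)

# Keep all rows; only the bar to highlight keeps its value.
dat.g$highlight <- ifelse(
  dat.g$type == "Accidents" & dat.g$country == "Brazil",
  dat.g$value, NA_real_
)

p2 <- ggplot(dat.g, aes(type, group = country)) +
  # layer 1: all bars in grey
  geom_col(aes(y = value), fill = "grey", alpha = 0.6, position = "dodge") +
  # layer 2: the single coloured bar, drawn on top
  geom_col(aes(y = highlight, fill = country), position = "dodge", na.rm = TRUE)
p2
```

Because both layers share `group = country`, the coloured bar should land on top of its grey counterpart rather than spreading across the whole cluster.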

tjebo
  • Thanks for the suggestion! Unfortunately, facets are not useful in my case, because I have a very complex bar chart, where several clusters of bars correspond to one overarching group. What do you mean by "Or just plot it in good old separating the data way" (I'm new to R, sorry)? – Ben Mar 10 '20 at 09:44