2

I have the following current output:

And I am aiming for a colouring like this, but only filled until the maximum level (e.g the fill stops at the level present):

The data to create this, is:

df <- tribble(~Question_Code,   ~RespondentLevel,
"Engagement - Inclusion",   5,
"External engagement - policies",   2,
"External engagement - technology", 5,
"Community data ",  5,
"Internal engagement",  5,
"Internal use of technology",   4,
"Familiarity/Alignment",    5,
"Environmental impacts",    5,
"Innovation",   2,
"Use of open-source technology",    2,
"Regulation of hardware & software",    5,
"In-house technical capacity",  5,
"Infrastructure procurement",   5,
"Algorithmic Error & Bias", 2,
"Control: Privacy", 5,
"Accountability in Governance Structures",  3,
"Open procurement", 5,
"Use in decision-making",   1,
"Accountability",   1,
"External Control", 4,
"Internal Control", 2,
"Open Data",    2)
levels <-  c("Open Data","Internal Control","External Control","Accountability",
             "Use in decision-making","Open procurement","Accountability in Governance Structures","Control: Privacy",
             "Algorithmic Error & Bias","Infrastructure procurement","In-house technical capacity",
             "Regulation of hardware & software","Use of open-source technology","Innovation",
             "Environmental impacts","Familiarity/Alignment",
             "Internal use of technology","Internal engagement","Community data",
             "External engagement - technology","External engagement - policies","Engagement - Inclusion")

df <- df %>% mutate(Domain = c(as.character((rep("Domain 1", 5))),
                  as.character(rep("Domain 2", 4)),
                  as.character(rep("Domain 3", 6)),
                  as.character(rep("Domain 4", 7))))

And for the ggplot:

df %>% 
ggplot(aes(x = RespondentLevel, y = fct_rev(Question_Code))) +
  geom_tile() +
  theme_minimal(16)

The colours to fill, I'm using:

with each colour corresponding to a domain, and each shade to a level:
Greens <- c("#edf8e9", "#bae4b3", "#74c476", "#31a354", "#006d2c")

Reds <- c("#fee5d9", "#fcae91", "#fb6a4a", "#de2d26", "#a50f15")

Yellows <- c("#ffffeb","#ffff9d","#ffff89", "#ffff4e", "#ffff14")

Blues <- c("#eff3ff","#bdd7e7","#6baed6","#3182bd",  "#08519c")

EDIT: geom_bar also does the trick, but not broken down by gradient. Trying to use this function:

ColourPalleteMulti <- function(df, group, subgroup){

  # Find how many colour categories to create and the number of colours in each
  categories <- aggregate(as.formula(paste(subgroup, group, sep="~" )), df, function(x) length(unique(x)))
  category.start <- (scales::hue_pal(l = 100)(nrow(categories))) # Set the top of the colour pallete
  category.end  <- (scales::hue_pal(l = 40)(nrow(categories))) # set the bottom

  # Build Colour pallette
  colours <- unlist(lapply(1:nrow(categories),
                           function(i){
                             colorRampPalette(colors = c(category.start[i], category.end[i]))(categories[i,2])}))
  return(colours)
}

colours <- ColourPalleteMulti(df, "Domain", "RespondentLevel") 
df %>% 
  ggplot(aes(x = fct_rev(Question_Code), y = RespondentLevel))+
  geom_bar(stat = "identity", aes(fill = Domain), alpha = .9) +
  coord_flip() +
  theme_minimal(16)+
  xlab(" ") +
  ggtitle("Baseline Report Card Sample Community")+
  scale_fill_manual("RespondentLevel", values = colours)+
  theme(legend.title = element_text(size = 14),
        legend.position = "none",
        legend.text = element_text(size = 14),
        plot.title = element_text(size=18, hjust = 0.5),
        plot.caption = element_text(size = 12, hjust = 1),
        axis.text.y = element_text(hjust = 0),
        panel.grid = element_line(colour = "#F0F0F0"),
        plot.margin = unit(c(1,1,0.5,1), "cm"))

enter image description here

Sorry for the long reprex, can adjust if possible

Corey Pembleton
  • 717
  • 9
  • 23
  • There have been a few SO posts with tricks for doing this, such as https://stackoverflow.com/questions/46333719/geom-tile-different-gradient-scale-and-color-for-different-factors?rq=1 or https://stackoverflow.com/q/49863210/5325862 – camille Mar 14 '19 at 16:43
  • Not sure I understand what you mean by "filled until the maximum level". Do you want shading of those colours across the whole plot, behind the data? – phalteman Mar 14 '19 at 16:43
  • @camille thank you for sharing: using `scale_fill_manual(values=c(dna="salmon", rna="steelblue")) ` is good for specific values, is there a way to do so grouped and with gradient? – Corey Pembleton Mar 14 '19 at 16:46
  • @phalteman yes, for example if a variable as value "4", the shading from 1:3 should also be filled, with their respective gradient. I thought that due to the gradient colouring, a flipped `geom_bar` wouldnt be able to do the trick. values where the score is lower (e.g. 2) won't have 3-5 filled – Corey Pembleton Mar 14 '19 at 16:48
  • I'm confused about how you want to structure the y-axis. In the Excel-type example, there isn't really a relationship between the groups listed on the y-axis and the levels that appear with them in the data – camille Mar 14 '19 at 16:52
  • @camille, the y will be structured by the "domains" , par example. The output is a rubric, and each question fits within a group ("domain). The order is less important, but the colour will the same across the domain, e.g. all "question code" within "Domain 1" will be filled green – Corey Pembleton Mar 14 '19 at 16:55
  • @camille, I've added in a geom_bar, but it isn't the aesthetic exactly I'm aiming for. Will continue playing with that function to create gradient. – Corey Pembleton Mar 14 '19 at 18:07

1 Answers1

5

Here are a few options for tricks. First off, to get the full set of levels for each question so you don't have gaps in your data, I used tidyr::complete. That's the data frame I'll be working with.

library(ggplot2)
library(dplyr)
library(tidyr)
library(purrr)
library(patchwork)

df_full <- df %>%
  complete(nesting(Domain, Question_Code), RespondentLevel) %>%
  mutate(RespondentLevel = as.character(RespondentLevel)) 

The easier option is to approximate the gradients with changing the alpha, and setting the hue (red, green, etc) based on domain. This forfeits the other colors you've chosen, and just uses the last, darkest color of each palette.

To do this, I made a list of all your palettes. In setting the fill, map_chr(palettes, 5) extracts the 5th element of each list, which is the darkest color of each. You'll probably want to adjust or remove one or both of the legends.

palettes <- list(Greens, Reds, Yellows, Blues)

ggplot(df_full, aes(x = RespondentLevel, y = Question_Code, fill = Domain, alpha = RespondentLevel)) +
  geom_tile() +
  theme_minimal() +
  facet_grid(rows = vars(Domain), scales = "free", space = "free") +
  scale_fill_manual(values = map_chr(palettes, 5))
#> Warning: Using alpha for a discrete variable is not advised.

The more difficult way splits the data by domain and makes a list of plots, then puts them together with the patchwork package. The benefit is that you can keep the full color palettes, but the downside is that it's more difficult to control things like sizing that you get from facet_grid, which adjusts for the fact that there are more questions listed in some domains than in others. You could resize these by hand in plot_layout if you think this approach is worthwhile. You'll also need to adjust some theme elements to mimic what facet_grid would do.

plot_list <- df_full %>%
  split(.$Domain) %>%
  map2(palettes, function(domain_df, pal) {
    ggplot(domain_df, aes(x = RespondentLevel, y = Question_Code, fill = RespondentLevel)) +
      geom_tile() +
      theme_minimal() +
      scale_fill_manual(values = pal) +
      theme(legend.position = "none") +
      labs(x = NULL, y = NULL)
  })

reduce(plot_list, `+`) +
  plot_layout(ncol = 1)

Note that normally, patchwork puts plots together like plot1 + plot2 to mimic ggplot layering. Since I had the plots in a list, I did this with purrr::reduce.

camille
  • 16,432
  • 18
  • 38
  • 60
  • Brilliant, thank you @camille I never would have been able to figure this out, I see that geom_tile does work, and looks far better than geom_bar. It works perfectly with my actual dataset. – Corey Pembleton Mar 14 '19 at 18:57