I'm trying to draw a plot that is colored by two variables (a factor and an intensity). I would like each factor level to have a different color, and I want the intensity to be a gradient between white and that color.

So far, I've approximated this by faceting on the factor, by setting color to the interaction between the two variables, and by setting color to the factor and alpha to the intensity. However, I still feel that a gradient between white and the full color on a single plot would represent this best.

Does anyone know how to do this without creating all of the color gradients by hand and setting them manually? In addition, is there a way to make the legend work as if the graph were using color and alpha, rather than listing every color as it does when color is set to the interaction?

So far I've tried:

library(ggplot2)

ggplot(diamonds, aes(carat, price, color=color, alpha=cut)) +
  geom_point()

ggplot(diamonds, aes(carat, price, color=interaction(color, cut))) +
  geom_point()

ggplot(diamonds, aes(carat, price, color=color)) +
  geom_point() +
  facet_wrap(~cut)

What I'm trying to achieve looks most like the graph using alpha, but instead of transparency I would like a gradient between white and that color. In addition, I would like the legend to look like the one from the color-and-alpha plot rather than, for example, the one from the interaction plot.

jtanman
  • I'd go with the `facet_wrap` option and look at one of the `scale_color_gradient*` functions to get the desired color gradient. The same aesthetic for more than one variable is something that just doesn't work well in `ggplot`. – neilfws Jun 11 '19 at 22:59
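
A minimal sketch of what the comment above suggests, assuming `cut` stands in for the intensity and `color` for the factor as in the question. The `"steelblue"` endpoint is an arbitrary choice, and every facet shares the same hue, so this only approximates the per-factor colors asked for.

```r
library(ggplot2)

# cut is an ordered factor, so as.integer() gives 1 (Fair) to 5 (Ideal)
p <- ggplot(diamonds, aes(carat, price, color = as.integer(cut))) +
  geom_point() +
  facet_wrap(~color) +
  scale_color_gradient(
    low = "white", high = "steelblue",   # assumed endpoint color
    breaks = seq_along(levels(diamonds$cut)),
    labels = levels(diamonds$cut),
    name = "cut"
  )
p
```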

1 Answer

The approach I usually use is to manipulate the factor values so I can plug them into the hcl() function.

First, some raw data:

library(tidyverse)

raw_data <-
  diamonds %>% 
  filter(price < 500, color %in% c("E", "F", "G")) %>% 
  mutate(
    factor = factor(color),
    intensity = cut,
    interaction = paste(factor, intensity)
  )

Next, use this kind of wrangling to get hex colors:

color_values <-
  raw_data %>%
  distinct(factor, intensity, interaction) %>%
  arrange(factor, intensity) %>%
  mutate(
    interaction = fct_inorder(interaction),
    # get integer position of factors
    factor_int = as.integer(factor) - 1,
    intensity_int = as.integer(intensity),
    # create equal intervals for color, adding in some padding so we avoid extremes of 0, 1
    hue_base = factor_int / (max(factor_int) + 0.5),
    light_base = 1 - (intensity_int / (max(intensity_int) + 2)),
    # using ^^^ to feed into hcl()
    hue = floor(hue_base * 360),
    light = floor(light_base * 100),
    # final colors
    hex = hcl(h = hue, l = light)
  )

color_values %>% filter(intensity == "Good")
#  factor intensity interaction factor_int intensity_int hue_base light_base   hue light hex    
#  <ord>  <ord>     <fct>            <dbl>         <int>    <dbl>      <dbl> <dbl> <dbl> <chr>  
# E      Good      E Good               0             2      0        0.714     0    71 #D89FA9
# F      Good      F Good               1             2      0.4      0.714   144    71 #81BA98
# G      Good      G Good               2             2      0.8      0.714   288    71 #BDA4D2
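
The lightness step can also be tried on its own. The sketch below uses a hypothetical helper, `shade_ramp()` (not part of the answer), that applies the same padding idea as `light_base` to a single hue. If the table above is accurate, the second shade for hue 144 should correspond to the `F Good` row.

```r
# Hypothetical helper: n shades of one hue, from light (low intensity)
# to dark (high intensity). `pad` mirrors the "+ 2" used for light_base.
shade_ramp <- function(hue, n, pad = 2) {
  level <- seq_len(n)
  # padding keeps lightness away from the extremes of 0 and 100
  light <- floor((1 - level / (n + pad)) * 100)
  hcl(h = hue, l = light)
}

shades <- shade_ramp(hue = 144, n = 5)
shades
```

Calling `shade_ramp()` once per factor level, each with its own hue, gives the same kind of palette as the `mutate()` call without needing the data frame.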

Plot it:

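# raw_data holds the points; x and y are the diamond length and width columns
ggplot(raw_data, aes(x, y, color = interaction)) +
  geom_count() +
  facet_wrap(~factor) +
  scale_color_manual(
    # name each hex by its label so colors match by name, not alphabetical order
    values = setNames(color_values$hex, as.character(color_values$interaction)),
    breaks = levels(color_values$interaction)
  ) +
  guides(color = guide_legend(override.aes = list(size = 5)))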
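
Varying the `hcl()` lightness only approximates a fade toward white. If a literal white-to-color blend is wanted, one possible sketch uses base R's `colorRampPalette()` instead. The base colors below are arbitrary assumptions, and `blend_from_white()` is a hypothetical helper, not something from the answer.

```r
# Assumed base colors, one per factor level (arbitrary picks)
base_colors <- c(E = "#D55E00", F = "#009E73", G = "#0072B2")
intensity_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Interpolate n + 1 steps from white to the base color, then drop pure white
blend_from_white <- function(base, n) {
  colorRampPalette(c("white", base))(n + 1)[-1]
}

# One named vector, keyed like paste(factor, intensity), e.g. "E Good"
palette <- unlist(lapply(names(base_colors), function(f) {
  shades <- blend_from_white(base_colors[[f]], length(intensity_levels))
  setNames(shades, paste(f, intensity_levels))
}))

palette[c("G Fair", "G Ideal")]
```

Because the vector is named with the same `"factor intensity"` labels, it could be passed as `values = palette` to `scale_color_manual()` in place of the `hcl()` colors.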

yake84
  • Wow, amazing, thanks! The only question I had is whether this can be adapted to lines. When I tried, the legend listed a label for each point in the line (creating multiple lines for each "interaction" term) rather than one label for each interaction. – jtanman Jun 12 '19 at 00:40
  • Also, I would still run into the issue with the legend, where each interaction would have its own entry rather than just one color per factor plus an intensity scale from 0 to 100 within that factor, right? – jtanman Jun 12 '19 at 00:43
  • So I've been able to modify it for lines by removing the labels in `scale_color_manual` and creating a new data frame with the distinct colors (in case some colors overlap, for example on #000000). So I'm doing this now: `colors <- color_values %>% distinct(interaction, hex)` and `scale_color_manual(values = colors$hex)` – jtanman Jun 12 '19 at 00:53
  • You are correct. I do not believe every value of 0-100 needs to be plotted for your audience to understand what's going on. Can you go up in intervals of 10 `round(1:99, -1)` or even 20 `1:99 %/% 20 / 5`? I think your initial approach of color + alpha is going to let you get that level of detail in the gradient but, as you pointed out, the concern is the legend. I adapted my answer from another question I worked on. Perhaps it might help generate some ideas: https://stackoverflow.com/questions/56120815/how-to-factor-sub-group-by-category/56123985?noredirect=1#comment99206202_56123985 – yake84 Jun 12 '19 at 00:55
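
The two binning expressions from the last comment can be tried directly. This is a small sketch; the vector name `intensity` is an assumption standing in for a 1 to 99 intensity column.

```r
intensity <- 1:99

# Round to the nearest multiple of 10
by_ten <- round(intensity, -1)

# Integer-divide into 20-wide bins, then rescale the bin index by 1/5
by_twenty <- intensity %/% 20 / 5

# How many observations fall in each coarse bin
table(by_twenty)
sort(unique(by_ten))
```

Either binned column could then take the place of `intensity` in the `color_values` wrangling, which keeps the number of legend entries per factor small.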