Here's a reproducible example of the type of plot I have with ggplot:

set.seed(123)
library(ggplot2)
df = data.frame(
    "x_var" = factor(rep(letters[1:10], 100)),
    "y_var" = factor(rep(LETTERS[1:10], each=100)),
    "fill_var" = sample(c(rexp(200, rate = 100),rexp(600, rate = 10), rexp(100, rate = 1), rexp(100, rate = 0.01)))
)

fill_colours = c("chartreuse4", "chartreuse3", "darkolivegreen2",
                 "khaki", "gold", "darkorange", "firebrick", "darkred")
fill_values_quantiles = seq(from = 0.05,
                            to = 0.95,
                            length.out = length(fill_colours) - 2)

ggplot(data = df, aes(x = x_var, y = y_var, fill = fill_var)) +
    geom_tile() +
    scale_fill_gradientn(
        colours = fill_colours,
        values = c(0,
                   quantile(df$fill_var, fill_values_quantiles),
                   ceiling(max(df$fill_var)))
    )

This gives this figure:

[Figure: tile plot with quantiles]

How do I get the legend bar like this?

[Figure: example of the desired legend bar]

I built this plot with this colour gradient because using quantiles to highlight the differences between tiles fits what I want to show. The plot will be produced many times, with different data frames.

I want the legend to look like the example above so that we get a better idea of what's going on for the green values. In words: I want ticks at the values I passed to the values argument of scale_fill_gradientn, and I want them to be evenly spaced. I still want a continuous colourbar.

Another idea, instead of using quantiles: I could probably work with a transformation of some sort, e.g. a log transformation, but in that case I'd want the real values of fill_var printed in the legend text at the ticks, not the transformed values. I'd prefer the first option, though.

tjebo
Léa

2 Answers


I wonder whether you want to log-transform the colour scale, since you have very few large values:

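## trans_breaks() and trans_format() come from the scales package
library(scales)

ggplot(data = df, aes(x = x_var, y = y_var, fill = fill_var)) +
  geom_tile() +
  scale_fill_gradientn(colours = fill_colours,
                       trans = "log10",
                       ## put breaks at powers of ten
                       breaks = trans_breaks("log10", function(x) 10^x),
                       ## label the breaks with the untransformed values
                       labels = trans_format("log10", function(x) round(10^x, 2)))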


yuliaUU

You're very close. Your question is basically "how do I convert my fill vector into percentiles?". For this you can use the ecdf function. Further comments are in the code.

## slightly changed to include 0 and 1
fill_values_quantiles <- seq(0, 1, length.out = length(fill_colours))
## use this for a vector of your quantile breaks for the labels (!)
quants <- quantile(df$fill_var, fill_values_quantiles)
## convert every value in your fill to quantiles
df$ptile_var <- ecdf(df$fill_var)(df$fill_var)

## use the percentiles for fill
ggplot(data = df, aes(x = x_var, y = y_var, fill = ptile_var)) +
  geom_tile() +
  scale_fill_gradientn(
    ## use your vectors from above for breaks and labels
    colours = fill_colours,
    breaks = fill_values_quantiles,
    labels = round(quants, 3)
  )
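
To see why the ecdf step gives evenly spaced ticks, here is a minimal base-R sketch. The toy vector `x` is made up for illustration and is not taken from the question's data. Evaluating `ecdf(x)` at `x` replaces each value with the fraction of observations at or below it, so strongly skewed values land at regular positions between 0 and 1, while `quantile()` recovers the original-scale values to use as tick labels.

```r
# Toy skewed vector (hypothetical data, not from the question)
x <- c(0.01, 0.1, 1, 10, 1000)

# ecdf(x) returns a step function; evaluating it at x gives, for each
# value, the fraction of observations less than or equal to it.
ptile <- ecdf(x)(x)
print(ptile)  # 0.2 0.4 0.6 0.8 1.0

# Evenly spaced probabilities serve as break positions on the 0-1 scale ...
probs <- seq(0, 1, length.out = 5)

# ... and the matching quantiles, in the original units, serve as labels.
labels <- quantile(x, probs)
print(unname(labels))
```

The fill aesthetic then uses `ptile`, and the legend shows `labels` at the positions given by `probs`.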

tjebo
    That's exactly it, thank you so much! I just added `limits = c(0,1)` within the `scale_fill_gradientn` function so that the minimum value appeared on the legend. – Léa Apr 03 '23 at 14:47
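
Since the question mentions that the plot will be produced for many data frames, the answer and the `limits = c(0, 1)` tweak from the comment could be wrapped in a small helper, roughly as sketched below. The function name `quantile_fill_scale` and its `digits` argument are hypothetical and only for illustration; the data and colours are copied from the question. Converting the labels to a plain character vector is a cautious choice here, not something the thread requires.

```r
library(ggplot2)

# Hypothetical helper (name and arguments are illustrative, not from the
# thread): builds a fill scale for a percentile-valued fill aesthetic,
# labelled with the original-scale quantiles of `values`.
quantile_fill_scale <- function(values, colours, digits = 3) {
  probs <- seq(0, 1, length.out = length(colours))
  scale_fill_gradientn(
    colours = colours,
    breaks  = probs,
    labels  = as.character(unname(round(quantile(values, probs), digits))),
    limits  = c(0, 1)  # per the comment: lets the break at 0 (the minimum) show
  )
}

# Data and colours as in the question
set.seed(123)
df <- data.frame(
  x_var = factor(rep(letters[1:10], 100)),
  y_var = factor(rep(LETTERS[1:10], each = 100)),
  fill_var = sample(c(rexp(200, rate = 100), rexp(600, rate = 10),
                      rexp(100, rate = 1), rexp(100, rate = 0.01)))
)
fill_colours <- c("chartreuse4", "chartreuse3", "darkolivegreen2",
                  "khaki", "gold", "darkorange", "firebrick", "darkred")

# Convert every fill value to its percentile, then plot the percentiles
df$ptile_var <- ecdf(df$fill_var)(df$fill_var)

p <- ggplot(df, aes(x = x_var, y = y_var, fill = ptile_var)) +
  geom_tile() +
  quantile_fill_scale(df$fill_var, fill_colours)
```

For another data frame, only the `ecdf` line and the first argument of the helper would need to change.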