Since my last question is running into dead ends, guess I will try it this way:

How do I label ONLY the top of the barplot for each condition? The label needs to show both the count and the intensity (i.e. the x- and y-axis values at the peak). To be clear: not label as in place a little dot, but label as in print the count and intensity exactly as they correspond to the peak.

I feel like I have googled every possible combination of words to get this answer and still cannot get it to work. Proof? I searched mode functions here, here, and graphical approaches here and here.

My plot looks like this (with my own data), without the labels of the peaks (image not included):

With the code to produce it:

#demo data:
set.seed(1234)

library(tidyverse)
library(fs)
n = 100000
silence = factor(c("sil1", "sil2", "sil3", "sil4", "sil5"))
treat = factor(c("con", "uos", "uos+wnt5a", "wnt5a"))
df <- expand.grid(silence = silence, treat = treat)
df <- data.frame(
  silence = rep(df$silence, each = 1000),
  treat = rep(df$treat, each = 1000)
)
df$intensity <- rgamma(nrow(df), shape = match(df$silence, unique(df$silence)),
                       scale = match(df$treat, unique(df$treat)))

p <-
  df %>%
  ggplot() +
  aes(x = intensity, fill = treat) +
  geom_histogram(bins = 100L, alpha = 0.75, position = "identity") +
  scale_fill_viridis_d(option = "plasma") +
  theme_minimal() +
  labs(x = "Intensity", y = "No. of intensities", title = "F1 Silencing", fill = "Treatment:") +
  facet_wrap(vars(silence))

So ideally I would have a little dot on the highest point within each treatment condition and each panel, labelled with the count and the intensity at that point. Example: a red dot with the corresponding count and intensity at the peak of the peach (uos+wnt5a) histogram.

Walker

1 Answer

You can request the layer_data() from a plot and use that to make new layers. Example below. I had to tweak your dummy data generation to be somewhat more realistic in terms of finding peaks and such (also, it didn't run). Your description was a bit confusing, since on the one hand you wanted to label these points and on the other to put a point at that spot. I chose the label in the example below.

set.seed(1234)

library(tidyverse)
library(fs)
n = 100000
silence = factor(c("sil1", "sil2", "sil3", "sil4", "sil5"))
treat = factor(c("con", "uos", "uos+wnt5a", "wnt5a"))
df <- expand.grid(silence = silence, treat = treat)
df <- data.frame(
  silence = rep(df$silence, each = 1000),
  treat = rep(df$treat, each = 1000)
)
df$intensity <- rgamma(nrow(df), shape = match(df$silence, unique(df$silence)),
                       scale = match(df$treat, unique(df$treat)))


g <- ggplot(df) +
  aes(x = intensity, fill = treat) +
  geom_histogram(bins = 100L, alpha = 0.75, position = "identity") +
  scale_fill_viridis_d(option = "plasma") +
  theme_minimal() +
  labs(x = "Intensity", y = "No. of intensities", title = "F1 Silencing", fill = "Treatment:") +
  facet_wrap(vars(silence))

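# layer_data() returns the bins ggplot2 computed for the histogram layer.
# Keep the tallest bin for each treatment (group) within each facet (PANEL).
ld <- layer_data(g) %>%
  group_by(group, PANEL) %>%
  filter(count == max(count)) %>%  # tied bins are all kept
  # restore the facet variable so each label lands in the right panel
  mutate(silence = unique(df$silence)[PANEL])

# Label each peak with its count and its intensity (the bin centre x)
g + ggrepel::geom_text_repel(
  data = ld,
  aes(x, y, label = paste0(count, " at ", scales::number(x))),
  inherit.aes = FALSE, min.segment.length = 0,
  nudge_y = 20
)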
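
Since the question also mentions a dot at the peak, here is a rough sketch of that variant: a red point plus a plain text label at the tallest bin. It is only an illustration under a few assumptions: the `toy` data frame is a small stand-in rather than the data from the question, the names `toy`, `h`, `peak` and `h_marked` are made up for this example, and `geom_text()` is used instead of ggrepel so the labels may overlap.

```r
library(ggplot2)
library(dplyr)

set.seed(1234)
# Small stand-in data set (an assumption, not the question's data)
toy <- data.frame(
  silence = factor(rep(c("sil1", "sil2"), each = 2000)),
  treat   = factor(rep(rep(c("con", "uos"), each = 1000), times = 2))
)
toy$intensity <- rgamma(nrow(toy), shape = 2, scale = as.integer(toy$treat))

h <- ggplot(toy, aes(x = intensity, fill = treat)) +
  geom_histogram(bins = 50L, alpha = 0.75, position = "identity") +
  facet_wrap(vars(silence))

# Tallest bin per treatment and panel, with the facet variable restored
peak <- layer_data(h) %>%
  group_by(group, PANEL) %>%
  filter(count == max(count)) %>%
  ungroup() %>%
  mutate(silence = factor(levels(toy$silence)[as.integer(PANEL)],
                          levels = levels(toy$silence)))

# Red dot at each peak, plus a label with the count and the bin centre
h_marked <- h +
  geom_point(data = peak, aes(x, y), colour = "red", inherit.aes = FALSE) +
  geom_text(data = peak,
            aes(x, y, label = paste0(count, " at ", round(x, 1))),
            inherit.aes = FALSE, vjust = -0.8, size = 3)
```

Printing `h_marked` draws the plot. Swapping `geom_text()` for `ggrepel::geom_text_repel()` should keep the labels from colliding.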

teunbrand
  • Amended the data as well - and I assume to improve the visibility, we just need to play around with the xlim's? – Walker Feb 23 '21 at 18:55
  • Yes, but you might also adjust the y-limits on a per-panel basis by setting `facet_wrap(..., scales = "free_y")`. – teunbrand Feb 23 '21 at 18:59
  • Do you think this would be amenable to wrapping as a function? I am definitely going to be re-using this for subsequent analyses and was curious – Walker Feb 23 '21 at 19:23
  • It would probably be best to write it up as a stat function. The main inefficiency now is that the plot has to be built twice. However, that requires some non-trivial meddling with ggplot2 internals. – teunbrand Feb 23 '21 at 19:27
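
Short of writing a stat, the `layer_data()` approach can be wrapped in a pair of plain functions. This is a minimal sketch, not a polished implementation: `peak_bins()` and `add_peak_labels()` are hypothetical names, and the caller is assumed to pass in the facet variable's name and its levels in panel order. It uses `slice_max(..., with_ties = FALSE)` so that each group gets exactly one peak, which differs from the `filter()` call in the answer when bins tie.

```r
library(ggplot2)
library(dplyr)

# Hypothetical helper: tallest histogram bin per group and panel.
# facet_name / facet_levels are supplied by the caller (an assumption of
# this sketch), with facet_levels given in panel order.
peak_bins <- function(plot, facet_name, facet_levels, layer = 1L) {
  peaks <- layer_data(plot, layer) %>%
    group_by(group, PANEL) %>%
    slice_max(count, n = 1, with_ties = FALSE) %>%
    ungroup()
  # Restore the facet variable so new layers land in the right panel
  peaks[[facet_name]] <- factor(facet_levels[as.integer(peaks$PANEL)],
                                levels = facet_levels)
  peaks
}

# Hypothetical wrapper: add "count at intensity" labels to the peaks
add_peak_labels <- function(plot, facet_name, facet_levels, nudge_y = 20) {
  peaks <- peak_bins(plot, facet_name, facet_levels)
  plot + ggrepel::geom_text_repel(
    data = peaks,
    aes(x, y, label = paste0(count, " at ", scales::number(x))),
    inherit.aes = FALSE, min.segment.length = 0, nudge_y = nudge_y
  )
}

# Small stand-in data set to exercise the helper
set.seed(1)
demo <- data.frame(
  silence   = factor(rep(c("sil1", "sil2"), each = 400)),
  treat     = factor(rep(rep(c("con", "uos"), each = 200), times = 2)),
  intensity = rgamma(800, shape = 3, scale = 2)
)
g_demo <- ggplot(demo, aes(intensity, fill = treat)) +
  geom_histogram(bins = 30, position = "identity", alpha = 0.75) +
  facet_wrap(vars(silence), scales = "free_y")

peaks_demo <- peak_bins(g_demo, "silence", levels(demo$silence))
```

With ggrepel installed, `add_peak_labels(g_demo, "silence", levels(demo$silence))` returns the labelled plot. The plot is still built twice (once inside `layer_data()` and once when it is drawn), so this only tidies the call and does not address the inefficiency mentioned above.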