
I recognize that this question is a close duplicate of an earlier one, but the solution given there (using method="last.qp") no longer works, so I'm asking it again.

The basic issue is that I'd like to use directlabels (or equivalent) to label smoothed means for each group (from stat_smooth()), rather than the actual data. The example below shows as close as I've gotten, but the labels aren't recognizing the grouping, or even the smoothed line. Instead, I'm getting the label at the last point. What I'd like is colour-coordinated text at the end of each stat_smooth(), rather than the legend on the right of the plot. This post provides an approach for labelling the last data point (the behaviour I'm seeing), but I'm looking for an approach to label automatically-generated summaries, if possible.

Here's an example:

library(ggplot2)
library(directlabels)

## Data
set.seed(10)
d <- data.frame(x=seq(1,100,1), y=rnorm(100, 3, 0.5))
d$z <- ifelse(d$y>3,1,0)

## Plot
p <- ggplot(d, aes(x=x, y=y, colour=as.factor(z))) +
  stat_smooth(inherit.aes=T, se=F, span=0.8, show.legend = T) +
  geom_line(colour="grey50") +
  scale_x_continuous(limits=c(0,110)) +
  geom_dl(label="text", method="maxvar.points", inherit.aes=T)
p

which makes this plot: [image]

phalteman

2 Answers


A solution using the ggrepel package, based on this answer:

library(tidyverse)
library(ggrepel)

set.seed(123456789)

d <- data.frame(x = seq(1, 100, 1), y = rnorm(100, 3, 0.5))
d$z <- ifelse(d$y > 3, 1, 0)

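# For each group, fit a loess with the same span passed to stat_smooth()
# and predict the smoothed value at that group's largest x.
labelInfo <-
  split(d, d$z) %>%
  lapply(function(grp) {
    data.frame(
      predAtMax = loess(y ~ x, span = 0.8, data = grp) %>%
        predict(newdata = data.frame(x = max(grp$x))),
      max = max(grp$x)
    )
  }) %>%
  bind_rows()

# split() orders the groups by factor level, so the levels line up with the rows
labelInfo$label <- levels(factor(d$z))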
labelInfo

#>   predAtMax max label
#> 1  2.538433  99     0
#> 2  3.293859 100     1

ggplot(d, aes(x = x, y = y, color = factor(z))) + 
  geom_point(shape = 1) +
  geom_line(colour = "grey50") +
  stat_smooth(inherit.aes = TRUE, se = FALSE, span = 0.8, show.legend = TRUE) +
  geom_label_repel(data = labelInfo, 
                   aes(x = max, y = predAtMax, 
                       label = label, 
                       color = label), 
                   nudge_x = 5) +
  theme_classic()
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Created on 2018-06-11 by the reprex package (v0.2.0).
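
The label positions can also be computed without the tidyverse. What follows is a minimal base R sketch, assuming the same example data as above. The helper name `smooth_end` is hypothetical and not part of any package. It refits a loess per group, so the result should only line up with the plotted curve if `span` (and any other loess settings) agree with what is passed to `stat_smooth()`.

```r
set.seed(123456789)
d <- data.frame(x = seq(1, 100, 1), y = rnorm(100, 3, 0.5))
d$z <- ifelse(d$y > 3, 1, 0)

# Hypothetical helper: smoothed value at the largest x of one group
smooth_end <- function(grp, span = 0.8) {
  fit <- loess(y ~ x, data = grp, span = span)
  x_end <- max(grp$x)
  data.frame(
    max = x_end,
    predAtMax = unname(predict(fit, newdata = data.frame(x = x_end)))
  )
}

# One row per group; the list names from split() become the labels
groups <- split(d, d$z)
labelInfo <- do.call(rbind, lapply(groups, smooth_end))
labelInfo$label <- names(groups)
labelInfo
```

The resulting `labelInfo` has the same three columns (`predAtMax`, `max`, `label`) used in the `geom_label_repel()` call above, so it could be passed as `data =` in the same way.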

Tung
  • I was originally hoping for a solution that didn't involve creating another dataframe to hold the loess predictions, but it doesn't seem that's possible. One nice thing about this solution is that `geom_label_repel` seems to work across facet panels easily, which is helpful for my actual problem. Thanks! – phalteman Jun 13 '18 at 17:43

You need to tell geom_dl what you want to appear on your plot. The code below should address your needs:

p <- ggplot(d, aes(x=x, y=y, colour=as.factor(z))) +
  stat_smooth(inherit.aes=TRUE, se=FALSE, span=0.8, method = "loess", show.legend = FALSE) +
  geom_line(colour="grey50") +
  scale_x_continuous(limits=c(0,110)) +
  geom_dl(label=as.factor(d$z), method="maxvar.points", inherit.aes=TRUE)

If you want text other than 0 and 1, build a label vector based on d$z and pass it in place of as.factor(d$z).
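
As a rough sketch of building such a label vector, assuming the example data from the question: the column name `lab` and the label strings are made up for illustration.

```r
set.seed(10)
d <- data.frame(x = seq(1, 100, 1), y = rnorm(100, 3, 0.5))
d$z <- ifelse(d$y > 3, 1, 0)

# Hypothetical label text derived from d$z (0 -> "y <= 3", 1 -> "y > 3")
d$lab <- factor(d$z, levels = c(0, 1), labels = c("y <= 3", "y > 3"))

# Cross-tabulate to check that each z value maps to exactly one label
table(d$z, d$lab)
```

`d$lab` would then take the place of `as.factor(d$z)` in the `geom_dl()` call.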


I could not find a method in geom_dl that puts the labels beside the last point of geom_smooth rather than the last data points, so I came up with a workaround:

p <- ggplot(d, aes(x=x, y=y, colour=as.factor(z))) +
  stat_smooth(inherit.aes=TRUE, aes(label=as.factor(z)), se=FALSE,
              span=0.8, method = "loess", show.legend = FALSE) +
  geom_line(colour="grey50") +
  scale_x_continuous(limits=c(0,110))

library(data.table)
# Layer 1 of the built plot holds the points computed by stat_smooth()
smooth_dat <- setDT(ggplot_build(p)$data[[1]])
# For each group, keep the row at the largest x (the end of the smoothed line)
smooth_lab <- smooth_dat[smooth_dat[, .I[x == max(x)], by=group]$V1]

p + annotate("text", x = smooth_lab$x, y = smooth_lab$y,
             label = smooth_lab$label, colour = smooth_lab$colour,
             hjust = -1)


M--
  • Thanks for the answer. That gets the text I want, but it doesn't actually map those labels to the smoothed means, just to the last 0 and 1 datapoints, which is what I'm trying to avoid. If you re-execute your code with `set.seed(10)` you'll get a figure that shows that the labels are not tied to the lines, but to the datapoints. Do you know a way to tie them to the smoothed lines instead? – phalteman Jun 12 '18 at 00:47
  • @phalteman I added a workaround. Maybe a solution for `geom_dl` is out there but I couldn't find it. – M-- Jun 12 '18 at 19:19
  • Thanks for the edits. I added a couple of lines to create `smooth_lab$label`, since it didn't seem to be created in the code you supplied. I think this is a good answer for this example, but I had trouble expanding it to my real problem, where I have a faceted plot: I couldn't get `annotate` to make the aesthetics work across multiple panels. @Tung's answer did work across panels, so I've accepted their answer, which is essentially the same workaround you supplied. Thanks again for the help! – phalteman Jun 13 '18 at 17:40
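
For the faceted case raised in the comments, one possible sketch is to compute the label positions per panel and per group, and to keep the faceting column in the label data. The faceting variable `site` and its values are hypothetical, added here only for illustration. The loess is refit per cell with `span = 0.8`, on the assumption that this matches the settings given to `stat_smooth()`.

```r
set.seed(10)
# Hypothetical faceting variable `site` added to the example data
d <- data.frame(
  site = rep(c("A", "B"), each = 100),
  x = rep(seq(1, 100, 1), times = 2),
  y = rnorm(200, 3, 0.5)
)
d$z <- ifelse(d$y > 3, 1, 0)

# One loess fit per site-by-group cell; keep the faceting column in the result
cells <- split(d, list(d$site, d$z), drop = TRUE)
labelInfo <- do.call(rbind, lapply(cells, function(cell) {
  fit <- loess(y ~ x, data = cell, span = 0.8)
  x_end <- max(cell$x)
  data.frame(
    site = cell$site[1],
    label = as.character(cell$z[1]),
    max = x_end,
    predAtMax = unname(predict(fit, newdata = data.frame(x = x_end)))
  )
}))
rownames(labelInfo) <- NULL
labelInfo
```

Because `labelInfo` carries the `site` column, passing it as `data =` to `geom_label_repel()` (or `geom_text()`) in a plot that uses `facet_wrap(~ site)` should place each label in its own panel, which `annotate()` does not do.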