I am not including the data at first. The problem arises from using geom_smooth with lots of data points (i.e. a large data set), so a minimal data example is hard to construct (I tried). I can share the data on request.
I have scores on several variables and want to see trends in these scores across the age of respondents (cross-sectional data). The data are in long format, so the names of the original variables are all in the column 'name', like this:
     age name     value
   <dbl> <chr>    <dbl>
 1    40 mo_clean     1
 2    40 mo_groc      3
 3    40 mo_trans     1
 4    40 mo_digi      3
 5    40 mo_emo       3
 6    40 mo_activ     1
 7    40 mo_supv      1
 8    40 mo_doct      1
 9    39 mo_clean     1
10    39 mo_groc      1
# … with 42,030 more rows
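Since the full data are not included, a simulated stand-in with the same three columns may help to reproduce the setup. This is only a sketch: df_sim, n_resp, the 35 to 65 age range and the 1 to 3 value range are assumptions for illustration, not the real data.

```r
library(tibble)

set.seed(1)

# Variable names taken from the printed rows above
vars <- c("mo_clean", "mo_groc", "mo_trans", "mo_digi",
          "mo_emo", "mo_activ", "mo_supv", "mo_doct")

n_resp <- 500  # hypothetical number of respondents

# One row per respondent and variable, mimicking the long format above.
# Ages and values are random, so the smoothed lines will be roughly flat.
df_sim <- tibble(
  age   = as.numeric(rep(sample(35:65, n_resp, replace = TRUE),
                         each = length(vars))),
  name  = rep(vars, times = n_resp),
  value = as.numeric(sample(1:3, n_resp * length(vars), replace = TRUE))
)
```

Because many respondents share the same age, this stand-in has the same feature as the real data: several rows per variable at every age, including the highest one.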
I want to:
- use geom_smooth and geom_label, and
- then switch to ggrepel::geom_label_repel to avoid overlapping labels
Getting labels to work with geom_smooth turned out to be difficult, but I managed it with the code below:
library(dplyr)    # %>%, group_by(), do(), filter()
library(broom)    # augment()
library(ggplot2)
library(ggrepel)

df %>%
  {
    ggplot(df, aes(age, value, label = name, color = name)) +
      geom_smooth(se = FALSE) +
      guides(color = "none") +
      # Label data: one loess fit per variable, keeping the rows at the highest age
      geom_label(
        data = group_by(., name) %>%
          do(augment(loess(value ~ age, .))) %>%
          filter(age == max(age)),
        aes(age, .fitted), nudge_x = 2
      )
  } +
  scale_x_continuous(breaks = seq(35, 65, by = 5)) +
  xlab("Age") +
  ylab(" ") +
  theme(text = element_text(size = 14))
which gives this result:
Now, as anticipated, replacing geom_label with geom_label_repel does not work, due to the many data points. I get the following message and warning:
`geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
Warning message:
ggrepel: 720 unlabeled data points (too many overlaps). Consider increasing max.overlaps
and all labels in the figure are dropped.
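The count of 720 in the warning hints that the label data may contain one row per observation at the highest age rather than one row per line. A quick way to check this is sketched below. It is self-contained, so it uses a hypothetical simulated data frame (df_check) in place of my df; the names label_data and one_per_name are also made up for illustration.

```r
library(dplyr)
library(broom)

# Hypothetical stand-in with the same columns as df (not the real data)
set.seed(1)
vars <- c("mo_clean", "mo_groc", "mo_trans", "mo_digi",
          "mo_emo", "mo_activ", "mo_supv", "mo_doct")
n_resp <- 300
df_check <- data.frame(
  age   = as.numeric(rep(sample(35:65, n_resp, replace = TRUE),
                         each = length(vars))),
  name  = rep(vars, times = n_resp),
  value = as.numeric(sample(1:3, n_resp * length(vars), replace = TRUE)),
  stringsAsFactors = FALSE
)

# Same label data as in the plotting code: one loess fit per variable,
# keeping the rows at the highest age
label_data <- df_check %>%
  group_by(name) %>%
  do(augment(loess(value ~ age, data = .))) %>%
  filter(age == max(age))

# Number of candidate labels per variable
summarise(label_data, n = n())

# Keeping only the first row per variable leaves one label per line
one_per_name <- slice_head(label_data, n = 1)
```

If n is larger than 1 for each variable, every line gets many labels stacked on (nearly) the same point, which would be consistent with the overlap warning. Whether passing something like one_per_name to geom_label_repel gives a good plot with the real data is untested.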
Increasing max.overlaps is not the way to go, I assume. Just to illustrate the extreme case, with max.overlaps = Inf:
[...]
geom_label_repel(
  data = group_by(., name) %>%
    do(augment(loess(value ~ age, .))) %>%
    filter(age == max(age)),
  aes(age, .fitted),
  max.overlaps = Inf
)
[...]
Any hints, for instance on where to find help (or even code suggestions)? Many web searches have not turned up what I'm looking for: how to combine geom_smooth with geom_label_repel to get a nice plot with each smoothed line labelled, without labels overlapping.
---
My question concerns geom_smooth with lots of data points; the linked question (Plot labels at ends of lines) concerned geom_line with few data points.
Note, however, that some of the answers to the linked question mention geom_smooth and present code that uses it. So I recommend looking at those answers, although they did not solve my problem.