38

Is there a way to put text along a density line, or for that matter, any path, in ggplot2? By that, I mean either once as a label, in this style of xkcd: 1835, 1950 (middle panel), 1392, or 2234 (middle panel). Alternatively, is there a way to have the line be repeating text, such as this xkcd #930 ? My apologies for all the xkcd, I'm not sure what these styles are called, and it's the only place I can think of that I've seen this before to differentiate areas in this way.

Note: I'm not talking about the hand-drawn xkcd style, nor putting flat labels at the top

I know I can place a straight/flat piece of text, such as via annotate or geom_text, but I'm curious about bending such text so it appears to be along the curve of the data.

I'm also curious if there is a name for this style of text-along-line?

Example ggplot2 graph using annotate(...):

ggplot graph with horizontal text

Above example graph modified with curved text in Inkscape:

modified version of first graph with text following curved paths


Edit: Here's the data for the first two trial runs in March and April, as requested:

df <- data.frame(
  monthly_run = c('March', 'March', 'March', 'March', 'March', 'March', 'March', 
                  'March', 'March', 'March', 'March', 'March', 'March', 'March', 
                  'April', 'April', 'April', 'April', 'April', 'April', 'April', 
                  'April', 'April', 'April', 'April', 'April', 'April', 'April'),
  duration    = c(36, 44, 45, 48, 50, 50, 51, 54, 55, 57, 60, 60, 60, 60, 30,
                  40, 44, 47, 47, 47, 53, 53, 54, 55, 56, 57, 69, 77)
  )

ggplot(df, aes(x = duration, group = monthly_run, color = monthly_run)) + 
  geom_density() + 
  theme_minimal()`
Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
byteit101
  • 3,910
  • 2
  • 20
  • 29
  • 1
    [This nice answer for `base::plot`](https://stackoverflow.com/a/27639186/1851712) may get you going. – Henrik Nov 06 '21 at 20:34
  • 1
    Here is the solution for your alternative option: https://stackoverflow.com/questions/34830677/spiral-wrapped-text – Bloxx Nov 06 '21 at 20:45

2 Answers2

57

Great question. I have often thought about this. I don't know of any packages that allow it natively, but it's not terribly difficult to do it yourself, since geom_text accepts angle as an aesthetic mapping.

Say we have the following plot:

library(ggplot2)

df <- data.frame(y = sin(seq(0, pi, length.out = 100)),
                 x = seq(0, pi, length.out = 100))

p <- ggplot(df, aes(x, y)) + 
  geom_line() + 
  coord_equal() +
  theme_bw()

p

And the following label that we want to run along it:

label <- "PIRATES VS NINJAS"

We can split the label into characters:

label <- strsplit(label, "")[[1]]

Now comes the tricky part. We need to space the letters evenly along the path, which requires working out the x co-ordinates that achieve this. We need a couple of helper functions here:

next_x_along_sine <- function(x, d)
{
  y <- sin(x)
  uniroot(f = \(b) b^2 + (sin(x + b) - y)^2 - d^2, c(0, 2*pi))$root + x
}
  
x_along_sine <- function(x1, d, n)
{
  while(length(x1) < n) x1 <- c(x1, next_x_along_sine(x1[length(x1)], d))
  x1
}

These allow us to create a little data frame of letters, co-ordinates and angles to plot our letters:

df2 <- as.data.frame(approx(df$x, df$y,  x_along_sine(1, 1/13, length(label))))
df2$label <- label
df2$angle <- atan(cos(df2$x)) * 180/pi

And now we can plot with plain old geom_text:

p + geom_text(aes(y = y + 0.1, label = label, angle = angle), data = df2,
              vjust = 1, size = 4, fontface = "bold")

enter image description here

Or, if we want to replace part of the line with text:

df$col <- cut(df$x, c(-1, 0.95, 2.24, 5), c("black", "white", "#000000"))

ggplot(df, aes(x, y)) + 
  geom_line(aes(color = col, group = col)) + 
  geom_text(aes(label = label, angle = angle), data = df2,
            size = 4, fontface = "bold") +
  scale_color_identity() +
  coord_equal() +
  theme_bw()

enter image description here

or, with some theme tweaks:

enter image description here


Addendum

Realistically, I probably won't get round to writing a geom_textpath package, but I thought it would be useful to show the sort of approach that might work for labelling density curves as per the OP's example. It requires the following suite of functions:

#-----------------------------------------------------------------------
# Converts a (delta y) / (delta x) gradient to the equivalent
# angle a letter sitting on that line needs to be rotated by to
# sit perpendicular to it. Includes a multiplier term so that we
# can take account of the different scale of x and y variables
# when plotting, as well as the device's aspect ratio.

gradient_to_text_angle <- function(grad, mult = 1)
{
  angle <- atan(mult * grad) * 180 / pi
}

#-----------------------------------------------------------------------
# From a given set of x and y co-ordinates, determine the gradient along
# the path, and also the Euclidean distance along the path. It will also
# calculate the multiplier needed to correct for differences in the x and
# y scales as well as the current plotting device's aspect ratio

get_path_data <- function(x, y)
{
  grad <- diff(y)/diff(x)
  multiplier <- diff(range(x))/diff(range(y)) * dev.size()[2] / dev.size()[1]
  
  new_x <- (head(x, -1) + tail(x, -1)) / 2
  new_y <- (head(y, -1) + tail(y, -1)) / 2
  path_length <- cumsum(sqrt(diff(x)^2 + diff(multiplier * y / 1.5)^2))
  data.frame(x = new_x, y = new_y, gradient = grad, 
             angle = gradient_to_text_angle(grad, multiplier), 
             length = path_length)
}

#-----------------------------------------------------------------------
# From a given path data frame as provided by get_path_data, as well
# as the beginning and ending x co-ordinate, produces the appropriate
# x, y values and angles for letters placed along the path.

get_path_points <- function(path, x_start, x_end, letters)
{
  start_dist <- approx(x = path$x, y = path$length, xout = x_start)$y
  end_dist <- approx(x = path$x, y = path$length, xout = x_end)$y
  diff_dist <- end_dist - start_dist
  letterwidths <- cumsum(strwidth(letters))
  letterwidths <- letterwidths/sum(strwidth(letters))
  dist_points <- c(start_dist, letterwidths * diff_dist + start_dist)
  dist_points <- (head(dist_points, -1) + tail(dist_points, -1))/2
  x <- approx(x = path$length, y = path$x, xout = dist_points)$y
  y <- approx(x = path$length, y = path$y, xout = dist_points)$y
  grad <- approx(x = path$length, y = path$gradient, xout = dist_points)$y
  angle <- approx(x = path$length, y = path$angle, xout = dist_points)$y
  data.frame(x = x, y = y, gradient = grad, 
             angle = angle, length = dist_points)
}

#-----------------------------------------------------------------------
# This function combines the other functions to get the appropriate
# x, y positions and angles for a given string on a given path.

label_to_path <- function(label, path, x_start = head(path$x, 1), 
                          x_end = tail(path$x, 1)) 
{
  letters <- unlist(strsplit(label, "")[1])
  df <- get_path_points(path, x_start, x_end, letters)
  df$letter <- letters
  df
}

#-----------------------------------------------------------------------
# This simple helper function gets the necessary density paths from
# a given variable. It can be passed a grouping variable to get multiple
# density paths

get_densities <- function(var, groups)
{
  if(missing(groups)) values <- list(var)
  else values <- split(var, groups)
  lapply(values, function(x) { 
    d <- density(x)
    data.frame(x = d$x, y = d$y)})
}

#-----------------------------------------------------------------------
# This is the end-user function to get a data frame of letters spaced
# out neatly and angled correctly along the density curve of the given
# variable (with optional grouping)

density_labels <- function(var, groups, proportion = 0.25)
{
  d <- get_densities(var, groups)
  d <- lapply(d, function(x) get_path_data(x$x, x$y))
  labels <- unique(groups)
  x_starts <- lapply(d, function(x) x$x[round((length(x$x) * (1 - proportion))/2)])
  x_ends <- lapply(d, function(x) x$x[round((length(x$x) * (1 + proportion))/2)])
  do.call(rbind, lapply(seq_along(d), function(i) {
    df <- label_to_path(labels[i], d[[i]], x_starts[[i]], x_ends[[i]])
    df$group <- labels[i]
    df}))
}

With these functions defined, we can now do:

set.seed(100)

df <- data.frame(value = rpois(100, 3),
                 group = rep(paste("This is a very long label",
                                   "that will nicely demonstrate the ability",
                                   "of text to follow a density curve"), 100))

ggplot(df, aes(value)) + 
  geom_density(fill = "forestgreen", color = NA, alpha = 0.2) +
  geom_text(aes(x = x, y = y, label = letter, angle = angle), 
            data = density_labels(df$value, df$group, 0.8)) +
  theme_bw() 

enter image description here

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • @Henrik thanks. The problem is as much with spacing as it is angle, though both are important. I can see a way to generalize this to other curves, but that's probably more in the realm of writing a package than an SO answer! A `geom_textpath` might be a useful tool. I'd be surprised if no-one has implemented this already, but I couldn't find anything similar by Googling. – Allan Cameron Nov 06 '21 at 23:00
  • I was surprised it wasn't implemented either. I'm definitely interested in a generalized solution, as my density plot example isn't a sine wave. Good work though! – byteit101 Nov 07 '21 at 19:36
  • @byteit101 if you want to post a reproducible example I will attempt a more general solution – Allan Cameron Nov 07 '21 at 19:41
  • 1
    @byteit101 see my update for an approach that works on density curves. – Allan Cameron Nov 07 '21 at 22:54
  • if I remember when the bounty delay expires, I'll come back and post a bounty for this answer ... – Ben Bolker Nov 07 '21 at 23:57
  • 1
    very nice! @teunbrand maybe something for the ggh4x package... ;) – tjebo Nov 08 '21 at 11:38
  • 1
    @tjebo I actually wondered that myself, but I can see so many gotchas with trying to find a general solution here that I'm guessing it would be too much work to implement. Also, we would need to convert the above into a `grid::grob` implementation to get it to work as a geom. I'd be interested to hear teunbrand's thoughts though. – Allan Cameron Nov 08 '21 at 11:43
  • @AllanCameron I added two of the runs to my answer. The Addendum looks very excellent, I will have to play around with it this week. Looks very promising! – byteit101 Nov 08 '21 at 17:23
  • 2
    @tjebo and Allan Cameron yes I see potential in a textpath/pathtext grob (which is also why I was asking [this question](https://stackoverflow.com/questions/69935520/offsetting-a-polyline-in-one-direction), where once again Allan Cameron didn't disappoint!) If you have ideas about what a geom layer should be able to do, you can leave suggestions [here](https://github.com/teunbrand/ggh4x/issues/58). – teunbrand Nov 12 '21 at 15:29
  • 12
    @teunbrand I actually got ahead of myself and made a `geom_textpath` package based on the answer here. You can install it via `remotes::install_github("AllanCameron/geomtextpath")` (it passes CRAN checks) and have a look at the examples under `?geom_textpath` . It can do some pretty neat things already, though I only started it a couple of days ago. I'm happy for you to adopt it into ggh4x if you like - there's not really enough to justify its own package, though I did consider expanding it to cover some other use cases. – Allan Cameron Nov 12 '21 at 15:35
  • 2
    That's great, nice work! I've seen packages been erected with less functionality, so it is likely justified as it is :) – teunbrand Nov 12 '21 at 15:53
16

In the end, this question prompted Teun van den Brand (@teunbrand) and I to develop the geomtextpath package, which is now on CRAN.

So now the question could be answered much more directly and simply:

library(geomtextpath)

ggplot(df, aes(x = duration, color = monthly_run)) + 
  geom_textdensity(aes(label = monthly_run, hjust = monthly_run,
                       vjust = monthly_run), size = 6) +
  scale_hjust_manual(values = c(0.4, 0.55)) +
  scale_vjust_manual(values = c(1.1, -0.2)) +
  scale_y_continuous(limits = c(0, 0.06)) +
  theme_minimal() +
  theme(legend.position = "none")

enter image description here

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • 2
    Brilliant! Although I would have probably chosen a different name for the package, e.g. "ggtextpath" :) – tjebo Jan 25 '22 at 16:00
  • 2
    @tjebo thanks. The name came about via this question and it seemed appropriate at the start. It has become less so as its scope broadened, but by that time it felt too late to change it. Ah well! – Allan Cameron Jan 25 '22 at 16:18