Connecting large points with lines and arrows in ggplot2

Question

Using ggplot, I would like to draw a segment, curve or path from one point to another, including an arrow. My issue is that I want to connect the line to the "edge" of the point, not the center, so that the arrow is visible regardless of the size of the point.

For example, the following code is fine:

library(ggplot2)

df <- data.frame(x1=10, x2=5, y1=10, y2=5)

ggplot(df) + 
    geom_point(aes(x=x1, y=y1)) + 
    geom_point(aes(x=x2, y=y2)) + 
    geom_segment(aes(x=x1, y=y1, xend=x2, yend=y2), 
        arrow = arrow())

But if I make the points really large, the arrow gets obscured by the point:

ggplot(df) + 
    geom_point(aes(x=x1, y=y1)) + 
    geom_point(aes(x=x2, y=y2), size=20) + 
    geom_segment(aes(x=x1, y=y1, xend=x2, yend=y2), 
        arrow = arrow())

How can I adjust the line so that the tip of the arrow always meets the edge of the point?

I've tried adjusting the coordinates of the line terminus, but the size aesthetic is on a different scale than the coordinates themselves (discussion here), so it's difficult to know how to change the coordinates in a generalizable way. The most relevant similar question is this, but the answers don't solve my problem. Thanks!

Why not draw the arrow in a different colour so it is visible above the point? — Richard Telford, Aug 17 '16 at 19:40
That's certainly an option. With lots of arrows, labels on top of the points etc. it might not be very pretty though... — dmp, Aug 17 '16 at 19:52

renato vitolo · Answer 1 · 2016-08-18T00:43:10.227

Personally, I'd go for a manual solution:

library(ggplot2)

plot.arr <- function(df, pointsiz=2, pointsiz.scale.factor=100){

    ## calculate weights to adjust for aspect ratio
    norm2 <- function(v) sqrt(sum(v^2))
    w <- c(diff(range(df[ ,c("x1","x2")])), diff(range(df[ ,c("y1","y2")])))
    w <- w/norm2(w)

    ## use "elliptical" norm to account for different scales on x vs. y axes
    norm2w <- function(v) sqrt(sum((v/w)^2))

    ## compute normalized direction vectors, using "elliptical" norm
    direc <- do.call("rbind",lapply(seq_len(nrow(df)), function(i) {
        vec <- with(df[i, ], c(dx=x2-x1, dy=y2-y1))
        data.frame(as.list(vec/norm2w(vec)))
    }))

    ## "shift back" endpoints:
    ## translate endpoints towards startpoints by a fixed length;
    ## translation direction is given by the normalized vectors;
    ## translation length is proportional to the overall size of the plot
    ## along both x and y directions
    ## pointsiz.scale.factor can be decreased/increased for larger/smaller pointsizes

    epsil <- direc * diff(range(df)) / pointsiz.scale.factor
    df$xend2 <- df$x2 - epsil$dx
    df$yend2 <- df$y2 - epsil$dy

    g <- ggplot(df) + 
        geom_point(aes(x=x1, y=y1), size=pointsiz) + 
        geom_point(aes(x=x2, y=y2), size=pointsiz) + 
        geom_segment(aes(x=x1, y=y1, xend=xend2, yend=yend2), 
                     arrow = arrow())        
    print(g)
}
set.seed(124)
##n.arr <- 1
n.arr <- 3

df <- data.frame(x1=10+rnorm(n.arr,10,400),
                 x2=5 +rnorm(n.arr,1),
                 y1=10+rnorm(n.arr,0,5),
                 y2=5 +rnorm(n.arr,2))
plot.arr(df)

df <- data.frame(x1=10+rnorm(n.arr,1000,4000),
                 x2=5 +rnorm(n.arr,1),
                 y1=10+rnorm(n.arr,0,5),
                 y2=5 +rnorm(n.arr,2))
plot.arr(df)


df <- data.frame(x1=10+rnorm(n.arr,3,4),
                 x2=5 +rnorm(n.arr,1),
                 y1=10+rnorm(n.arr,0,5),
                 y2=5 +rnorm(n.arr,2))
plot.arr(df, pointsiz=4, pointsiz.scale.factor=50)
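
The endpoint shift inside plot.arr can also be computed on its own, separately from the plotting, which makes the shifted coordinates easier to inspect. Below is a minimal vectorised sketch of the same calculation, assuming the same column names (x1, y1, x2, y2) and that both axes have a non-zero span. The name shift_endpoints is hypothetical and not part of ggplot2.

```r
# Hypothetical helper: same "elliptical norm" shift as plot.arr, without the plot.
# Assumes columns x1, y1, x2, y2 and a non-zero span on both axes.
shift_endpoints <- function(df, scale_factor = 100) {
  # Axis spans, rescaled to unit Euclidean length (the weights w in plot.arr)
  w <- c(diff(range(df$x1, df$x2)), diff(range(df$y1, df$y2)))
  w <- w / sqrt(sum(w^2))

  # Direction of each arrow and its length under the weighted norm
  dx <- df$x2 - df$x1
  dy <- df$y2 - df$y1
  len <- sqrt((dx / w[1])^2 + (dy / w[2])^2)

  # Shift length: overall data range divided by the scale factor
  step <- diff(range(df$x1, df$x2, df$y1, df$y2)) / scale_factor

  # Move each endpoint back towards its start point along the arrow
  df$xend2 <- df$x2 - dx / len * step
  df$yend2 <- df$y2 - dy / len * step
  df
}

df <- data.frame(x1 = c(0, 20), y1 = c(0, 5), x2 = c(10, 10), y2 = c(10, 0))
shifted <- shift_endpoints(df)
shifted[, c("x2", "xend2", "y2", "yend2")]
```

The returned xend2 and yend2 columns can then be mapped to xend and yend in geom_segment, exactly as plot.arr does.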

Regarding your (excellent) question in the comment below: ideally, to address that properly one would perform calculations embedded within a "native ggplot" setting, as opposed to the explicit "manual" procedure above. This might be possible, but it requires much deeper, Paul-Murrell-like knowledge, which I definitely do not possess. Specifically: after the third plot above I tried

ggsave("test1.pdf", width=6, height=6)
ggsave("test2.pdf", width=16, height=6)

Using a larger width has the side effect of stretching the distance between the arrow tips and the end points, because the three arrows are more or less horizontally aligned. I see no other option than manually adjusting pointsiz.scale.factor after visual inspection of the PDF files. Alas, this is definitely not "native ggplot": it is brute-force, a-posteriori trial-and-error. However, it can be shown to converge in linear personal time, provided that the aspect ratio of the PDF is fixed in advance; new aspect ratios require new trial-and-error estimation. Similar remarks hold when using larger point sizes: to find a visually pleasing distance between the arrow tips and the point edges, I see no other option than trial-and-error. Such are the limits of manual approaches...
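
One way to reduce the trial-and-error when the output size changes is to express the gap as a physical length and convert it to data units per axis. The sketch below is only an approximation under stated assumptions: shorten_by_mm, gap_mm, panel_w_mm and panel_h_mm are hypothetical names, the panel size in mm has to be supplied by hand (the plotting panel is smaller than the saved page because of axis labels and margins), and xlim/ylim must match the axis limits actually drawn.

```r
# Hypothetical helper: shorten each segment by a fixed physical gap (in mm).
# panel_w_mm / panel_h_mm are ASSUMED panel dimensions supplied by the user;
# xlim / ylim are assumed to be the axis limits actually drawn.
shorten_by_mm <- function(df, gap_mm, panel_w_mm, panel_h_mm,
                          xlim = range(df$x1, df$x2),
                          ylim = range(df$y1, df$y2)) {
  # Millimetres per data unit on each axis
  sx <- panel_w_mm / diff(xlim)
  sy <- panel_h_mm / diff(ylim)

  # Physical length of each arrow on the page
  dx <- df$x2 - df$x1
  dy <- df$y2 - df$y1
  len_mm <- sqrt((dx * sx)^2 + (dy * sy)^2)

  # Fraction of the arrow to cut off; capped so the end never passes the start
  frac <- pmin(gap_mm / len_mm, 1)
  df$xend2 <- df$x2 - frac * dx
  df$yend2 <- df$y2 - frac * dy
  df
}

df <- data.frame(x1 = 0, y1 = 0, x2 = 10, y2 = 0)

# Same 5 mm gap on a 100 mm wide panel and on a 200 mm wide panel
narrow <- shorten_by_mm(df, gap_mm = 5, panel_w_mm = 100, panel_h_mm = 100,
                        ylim = c(0, 10))
wide   <- shorten_by_mm(df, gap_mm = 5, panel_w_mm = 200, panel_h_mm = 100,
                        ylim = c(0, 10))
narrow$xend2  # 9.5
wide$xend2    # 9.75
```

The wider panel needs a smaller shift in data units for the same gap on paper, which is the stretching effect described above. To keep the drawn limits equal to xlim and ylim, the plot can use coord_cartesian(xlim = ..., ylim = ..., expand = FALSE). The value of gap_mm still has to be chosen to match the drawn point radius, since this sketch does not derive it from the size aesthetic.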

Wow that's excellent. I was trying to do this with simpler geometry and it wasn't working. How can I modify this for a different aspect ratio (I want to print the graph to a pdf with different height/width)? — dmp, Aug 18 '16 at 00:31
