
I have two sets of latitude and longitude variables for a large number of rows in my data frame (~100,000). I am trying to make a plot that connects the two sets of coordinates (i.e., ~100,000 lines running from latitude1,longitude1 to latitude2,longitude2) using geom_segment, with a very low alpha so that the many overlapping lines stay transparent.

I would like to emphasize the start and end points of those lines, and I reckon the best way to do that would be a colour gradient from start to end (let's say green to red).

Is it possible to draw a geom_segment line with a colour gradient? If not, do you know another way to emphasize start vs end with so many lines?

(I realize that it could end up looking messy because there are so many lines, but I suspect that many of them go in the same direction.)

Here is some example data of 5 rows (but in reality I have ~100,000, so it should be somewhat computationally efficient):

example.df <- as.data.frame(matrix(c(1, 435500, 387500, 320000, 197000,
                                     2, 510500, 197500, 513000, 164000,
                                     3, 164500,  40500, 431000, 385000,
                                     4, 318500, 176500, 316000, 172000,
                                     5, 331500, 188500, 472000, 168000),
                                   nrow = 5, ncol = 5, byrow = TRUE))
colnames(example.df) <- c("ID", "longitude.1", "latitude.1",
                          "longitude.2", "latitude.2")

library(ggplot2)
library(ggforce)
# alpha and colour are constants here, so they are set outside aes()
ggplot(example.df, aes(longitude.1, latitude.1)) +
  geom_link(aes(x = longitude.1, y = latitude.1,
                xend = longitude.2, yend = latitude.2),
            alpha = 0.5, colour = "black") +
  coord_equal()

This produces these five lines: [plot of the five example segments]

I would like these lines to start as blue at their first longitude-latitude coordinate point and end as red at the second longitude-latitude coordinate point.

zx8754
Abdel
  • `ggforce` has geoms that do this: https://ggforce.data-imaginist.com/reference/geom_link.html – camille Apr 18 '19 at 19:21
  • Thank you @camille! I figured out how geom_link can make the line go more transparent, but I can't seem to figure out how to make it change colours from the start to the end of the line... any ideas on how to achieve that? Many thanks again! – Abdel Apr 18 '19 at 22:04
  • Maybe if you add a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Without seeing your data, I don't know how it would be any different from the color gradient example in the vignette I posted – camille Apr 18 '19 at 22:20
  • My apologies! I have now produced an example of five lines. Many thanks again for your help! – Abdel Apr 19 '19 at 19:34
  • @Abdel: You can always emphasize start and end by plotting arrows instead of plain lines (by adding the `arrow` option inside `geom_segment`). Would that help? – symbolrush Apr 23 '19 at 13:40
  • Thank you for the suggestion @symbolrush, but I'm afraid that would not work well with the many lines I have in the actual data (~100,000). I think transparent lines with changing colours between start and end would work best with so many lines... – Abdel Apr 23 '19 at 13:41
  • Related: [Colour geom_segment in ggplot2 according to segment length](https://stackoverflow.com/questions/48058055/colour-geom-segment-in-ggplot2-according-to-segment-length) – Henrik Apr 23 '19 at 15:35

1 Answer


The ggforce approach looks like the best fit for what you are asking. Your code was almost there, but it is missing the colour = stat(index) mapping inside aes() (spelled after_stat(index) from ggplot2 3.3.0 onwards, where stat() is the older form). I assume that index is a statistic that geom_link() calculates under the hood to interpolate the colours.

ggplot(example.df, aes(longitude.1, latitude.1))+
  geom_link(aes(x = longitude.1, y = latitude.1,
                xend = longitude.2, yend = latitude.2, 
                colour = stat(index)), lineend = "round") +
  scale_colour_gradient(low = "red", high = "green") +
  coord_equal()

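To make the guess about index more concrete, here is a minimal sketch of what I assume the interpolation amounts to, done by hand with ggplot2 only. The helper interpolate_links() and the column names it returns are hypothetical, made up for this illustration; they are not part of ggforce. Each line is cut into n points and every point gets an index running from 0 at the start to 1 at the end, which geom_path() can then map to colour.

```r
library(ggplot2)

example.df <- data.frame(
  ID          = 1:5,
  longitude.1 = c(435500, 510500, 164500, 318500, 331500),
  latitude.1  = c(387500, 197500,  40500, 176500, 188500),
  longitude.2 = c(320000, 513000, 431000, 316000, 472000),
  latitude.2  = c(197000, 164000, 385000, 172000, 168000)
)

# Hypothetical helper (not a ggforce function): returns n points per row,
# linearly interpolated between the start and end coordinates.
interpolate_links <- function(df, n = 20) {
  row   <- rep(seq_len(nrow(df)), each = n)             # source row of each point
  index <- rep(seq(0, 1, length.out = n), times = nrow(df))
  data.frame(
    ID        = df$ID[row],
    index     = index,                                  # 0 = start, 1 = end
    longitude = df$longitude.1[row] +
      index * (df$longitude.2[row] - df$longitude.1[row]),
    latitude  = df$latitude.1[row] +
      index * (df$latitude.2[row] - df$latitude.1[row])
  )
}

links <- interpolate_links(example.df, n = 20)

# One path per ID, coloured by position along the line
p <- ggplot(links, aes(longitude, latitude, group = ID, colour = index)) +
  geom_path(lineend = "round") +
  scale_colour_gradient(low = "red", high = "green") +
  coord_equal()
p
```

The helper is vectorised, so it should scale to many rows, but the result has n times as many rows as the input; with ~100,000 lines a small n is probably worth trying first.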

A word of warning though, seeing as you intend to plot many lines: if you set alpha in geom_link(), the individual pieces that make up each line become clearly visible:

ggplot(example.df, aes(longitude.1, latitude.1))+
  geom_link(aes(x = longitude.1, y = latitude.1,
                xend = longitude.2, yend = latitude.2, 
                colour = stat(index)), lineend = "round", size = 10, alpha = 0.1) +
  scale_colour_gradient(low = "red", high = "green") +
  coord_equal()

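One thing that might soften the segmentation, offered as an untested guess rather than a fix: with lineend = "round" every small piece gets a rounded cap that overlaps its neighbour, and overlapping transparent areas look darker. Flat caps (lineend = "butt") should avoid that overlap. The sketch below also lowers n, the geom_link() argument for the number of points per line (default 100), which matters for speed with ~100,000 lines. The values 20 and 0.1 are illustrative only, and after_stat(index) is the newer spelling of stat(index).

```r
library(ggplot2)
library(ggforce)

example.df <- data.frame(
  ID          = 1:5,
  longitude.1 = c(435500, 510500, 164500, 318500, 331500),
  latitude.1  = c(387500, 197500,  40500, 176500, 188500),
  longitude.2 = c(320000, 513000, 431000, 316000, 472000),
  latitude.2  = c(197000, 164000, 385000, 172000, 168000)
)

p <- ggplot(example.df) +
  geom_link(aes(x = longitude.1, y = latitude.1,
                xend = longitude.2, yend = latitude.2,
                colour = after_stat(index)),
            lineend = "butt",  # assumption: flat caps do not overlap at the joints
            n = 20,            # fewer points per line than the default of 100
            size = 10,         # same width as in the example above
            alpha = 0.1) +
  scale_colour_gradient(low = "red", high = "green") +
  coord_equal()
p
```

Whether faint seams remain will depend on the graphics device and its anti-aliasing, so it is worth checking the output at the final resolution.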

Alternatively, you can use arrowheads to indicate end positions as follows:

ggplot(example.df, aes(longitude.1, latitude.1)) +
  geom_segment(aes(xend = longitude.2, yend = latitude.2),
               arrow = arrow()) +
  coord_equal()


A potential downside is that very short segments may be overemphasized with the arrows.
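A possible way around that, sketched under the assumption that a fixed length cut-off is acceptable: compute each segment's length and only draw arrowheads on segments longer than a threshold. The names seg.length and min_len, and the value 10000 (in the same units as the coordinates), are hypothetical choices for this example.

```r
library(ggplot2)

example.df <- data.frame(
  ID          = 1:5,
  longitude.1 = c(435500, 510500, 164500, 318500, 331500),
  latitude.1  = c(387500, 197500,  40500, 176500, 188500),
  longitude.2 = c(320000, 513000, 431000, 316000, 472000),
  latitude.2  = c(197000, 164000, 385000, 172000, 168000)
)

# Euclidean length of each segment in coordinate units
example.df$seg.length <- sqrt(
  (example.df$longitude.2 - example.df$longitude.1)^2 +
  (example.df$latitude.2  - example.df$latitude.1)^2
)

min_len  <- 10000  # hypothetical threshold; tune it to your data
long.df  <- example.df[example.df$seg.length >= min_len, ]
short.df <- example.df[example.df$seg.length <  min_len, ]

p <- ggplot(mapping = aes(longitude.1, latitude.1,
                          xend = longitude.2, yend = latitude.2)) +
  geom_segment(data = short.df) +                       # short: plain line
  geom_segment(data = long.df,
               arrow = arrow(length = unit(0.1, "inches"))) +  # long: arrowhead
  coord_equal()
p
```

In the five-row example only the segment with ID 4 falls below this threshold and is drawn without an arrowhead.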

Hope this helps!

teunbrand