4

I'm trying to make a genealogy diagram of the history of EDA, coloring the edges and arrows by the Institution of the PhD degree. This sort of works, but the filled arrow heads in the diagram appear unfilled in the legend. Is there some way to get what I want?

Here's my MWE:

library(readxl)   # Read Excel Files
library(dplyr)    # A Grammar of Data Manipulation
library(here)     # A Simpler Way to Find Your Files
library(ggraph)   # An Implementation of Grammar of Graphics for Graphs and Networks
library(igraph)   # Network Analysis and Visualization

EDA_geneaology <- read_excel(here("EDA-geneaology.xlsx"))

EDA_gen <- EDA_geneaology %>%
  rename(parent = advisor, 
         child = student,
         Institution = institution) 

#' Clean up some links not to be shown
EDA_gen <- EDA_gen %>% 
  mutate(main = (child %in% c("John Tukey", "Harold Gulliksen")) ) %>% 
  filter( !(parent %in% c("Solomon Lefschetz", "James Angell")) ) %>% 
  filter( !(child %in% c("Clyde Coombs")))

EDA_graph <- graph_from_data_frame(EDA_gen[,c(1,3,2,4,6)])


ggraph(EDA_graph, layout="kk") + 
  geom_edge_link(aes(color=Institution, fill=Institution),
                 arrow = grid::arrow(type = "closed", 
                                     angle=15, 
                                     length = unit(0.15, "inches"))
                ) + 
  geom_node_point() +
  geom_node_text(aes(label = name), repel = TRUE) +
  ggtitle("Specimen of a Chart of Geneaology of EDA") + 
  theme_graph() +
  theme(legend.position = 'bottom') 

And this is my graph:

enter image description here

Edit

Here is the data: As a link: https://www.dropbox.com/s/oq3jwvg8bto93ln/EDA-geneaology.xlsx?dl=0

and as dput()

> dput(EDA_gen)
structure(list(parent = c("John Tukey", "John Tukey", "John Tukey", 
"John Tukey", "Samuel Wilks", "Frederick Mosteller", "Frederick Mosteller", 
"Frederick Mosteller", "Robert Abelson", "John Tukey", "Harold Gulliksen", 
"Harold Gulliksen", "John Tukey", "Arthur Dempster", "John Tukey", 
"John Hartigan", "Samuel Wilks", "Louis Leon Thurstone", "John Tukey", 
"Frederick Mosteller", "Louis Leon Thurstone", "Harold Gulliksen", 
"John Hartigan", "Leo Goodman", "Ledyard Tucker", "Andreas Buja", 
"Dianne Cook", "Peter Huber", "Arthur Dempster", "Frederick Foster", 
"Antony Unwin", "Antony Unwin", "John Hartigan", "Heike Hofmann"
), Institution = c("Princeton", "Princeton", "Princeton", "Princeton", 
"Princeton", "Harvard", "Harvard", "Harvard", "Yale", "Princeton", 
"Princeton", "Princeton", "Princeton", "Harvard", "Princeton", 
"Yale", "Princeton", "Chicago", "Princeton", "Harvard", "Chicago", 
"Princeton", "Yale", "Chicago", "Univ. Illinois", "Rutgers", 
"Iowa State", "ETH Zürich", "Harvard", "Trinity College", "Trinity College", 
"Augsberg", "Yale", "Iowa State"), child = c("Arthur Dempster", 
"Leo Goodman", "David Hoaglin", "Frederick Mosteller", "Frederick Mosteller", 
"Persi Diaconis", "Stephen Fienberg", "Stanley Wasserman", "Lee Wilkinson", 
"Robert Abelson", "Michael Friendly", "Howard Wainer", "Paul Velleman", 
"Richard Heiberger", "Karen Kafadar", "Jay Emerson", "Leo Goodman", 
"Harold Gulliksen", "John Hartigan", "Sanford Weisberg", "Ledyard Tucker", 
"James Ramsay", "William Eddy", "Shelby Haberman", "Peter Schönemann", 
"Dianne Cook", "Hadley Wickham", "Andreas Buja", "John Chambers", 
"Antony Unwin", "Graham Wills", "Heike Hofmann", "Heike Hofmann", 
"Hadley Wickham"), PhDyear = c(1956, 1950, 1971, 1946, 1946, 
1974, 1968, 1977, 1975, 1953, 1972, 1970, 1976, 1972, 1979, 2002, 
1950, NA, 1962, 1974, 1946, 1970, 1972, 1970, 1964, 1993, 2008, 
1980, 1966, 1982, 1992, 2000, 2000, 2008), MGD_id = c(15981, 
35023, 35266, 35033, 35033, 18747, 58815, 13739, 238317, 132728, 
72941, NA, 52468, 194419, 35276, 1380, 35023, 128919, 29486, 
13739, 194457, 171749, 35342, 59032, 220653, 18734, 145799, 18641, 
113988, 45024, 244314, 46503, 46503, 145799), main = c(FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)), row.names = c(NA, 
-34L), class = c("tbl_df", "tbl", "data.frame"))
user101089
  • 3,756
  • 1
  • 26
  • 53

1 Answers1

4

In the package ggraph, the function geom_edge_link will send you to geom_edge_path, which is where you'd go to find the parameters for drawing the legend. I added the name (custom) and the parameter fill. For fill, you can leave the setting to color because you won't have a different color in the arrowhead than the rest of the arrow.

# from GeomEdgePath
draw_key_custom = function(data, params, size) {
  segmentsGrob(0.1, 0.5, 0.9, 0.5,
               gp = gpar(
                 col = alpha(data$edge_colour, data$edge_alpha),
                 fill = alpha(data$edge_colour, data$edge_alpha),  # <- I'm new!
                 lwd = data$edge_width * .pt,
                 lty = data$edge_linetype, lineend = 'butt'
               ),
               arrow = params$arrow
  )
}

Now you just have to add it to your graph.

ggraph(EDA_graph, layout="kk") + 
  geom_edge_link(aes(color = Institution),
                 arrow = grid::arrow(type = "closed", 
                                     angle=15, 
                                     length = unit(0.15, "inches")),
                 key_glyph = "custom") +                           # <- I'm new!
  geom_node_point() +
  geom_node_text(aes(label = name), repel = TRUE) +
  ggtitle("Specimen of a Chart of Geneaology of EDA") + 
  theme_graph() +
  theme(legend.position = 'bottom')

enter image description here

Kat
  • 15,669
  • 3
  • 18
  • 51