I have a question about how to rearrange my data and build a "network" visualization of interactions in R. I have a list of meetings and their attendees, organized as follows:

Meeting ID  Attendee
1           John
1           Mark
1           Kevin
2           Kevin
2           Sam

I want to create a visualization that shows the network of people that any individual has spoken to. For example, if I choose Kevin, I'd want a central node for Kevin with three connected nodes representing Sam, Mark, and John, since Kevin participated in a meeting with each of them. It'd also be cool to adjust the size of the nodes based on the number of interactions.

It'd also be useful if you could help re-arrange the data into the following shape, and then I can try to work something out from there.

Individual  Contact  Quantity of Interactions
Kevin       John     1
Kevin       Mark     1
Kevin       Sam      1
SecretAgentMan

2 Answers

There are several packages that help with visualizing networks (also called graphs). See the gR Task View on CRAN for details (https://cran.r-project.org/web/views/gR.html).

Here is one approach using the diagram package (which assumes that John and Mark should also be connected since they could have talked to each other at meeting 1).

library(diagram)

mydat <- data.frame(ID=rep(1:2, c(3,2)),
                    Attendee=c('John', 'Mark', 'Kevin', 'Kevin', 'Sam'))

people <- unique(mydat$Attendee)

mydat$personID <- match(mydat$Attendee, people)

M <- matrix(0, nrow=length(people), ncol=length(people))

# break data frame into meetings
mydat2 <- split(mydat, mydat$ID)

# update M for each meeting
for(df in mydat2) {
  combs <- combn(df$personID, 2)
  M[t(combs)] <- df$ID[1]
}


plotmat(M, name=people,
        curve=0, arr.type='none')
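
The loop above stores the meeting ID in M, so a pair who met more than once only keeps the ID of their last shared meeting. If a count of shared meetings is wanted instead (for example, to scale nodes or edges), one possible base R sketch is shown below. It assumes the same mydat as above; the names inc, counts, and interactions are introduced here purely for illustration.

```r
mydat <- data.frame(ID = rep(1:2, c(3, 2)),
                    Attendee = c('John', 'Mark', 'Kevin', 'Kevin', 'Sam'))

# meeting-by-attendee incidence matrix (1 = attended that meeting)
inc <- unclass(table(mydat$ID, mydat$Attendee))

# person-by-person matrix: entry [i, j] = number of meetings i and j shared
counts <- crossprod(inc)
diag(counts) <- 0   # a person does not "interact" with themselves

# total number of interactions per person, a candidate for node size
interactions <- rowSums(counts)

counts
interactions
```

Rows and columns of counts are labelled by attendee name, so counts["Kevin", ] gives Kevin's contacts directly.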
Greg Snow

For reshaping your data you could use dplyr and purrr from the tidyverse. Here is a chapter about visualising networks using ggraph.

For your first step you can transform your data like this:

library(dplyr)
library(purrr)

data <- tibble(
  Attendee = c('John', 'Mark', 'Kevin', 'Kevin', 'Sam', 'John'),
  ID = c(1,1,1,2,2,2))

data %>% 
  arrange(ID, Attendee) %>%
  group_by(ID) %>%
  filter(n() > 1) %>%
  split(.$ID) %>%
  map(., 1) %>%
  map(~combn(.x, m = 2)) %>%
  map(~t(.x)) %>%
  map_dfr(as_tibble) %>%
  group_by(V1, V2) %>%
  summarise(
    N = n()) %>%
  ungroup()

Result:

# A tibble: 5 x 3
  V1    V2        N
  <chr> <chr> <int>
1 John  Kevin     2
2 John  Mark      1
3 John  Sam       1
4 Kevin Mark      1
5 Kevin Sam       1

This is adapted from and explained in this article by W.R. Chase.
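
The pipeline above lists each pair once, in alphabetical order. To get the per-individual shape from the question (Individual, Contact, Quantity of Interactions), a self-join on the meeting ID is one option. The following is a minimal base R sketch using the question's original five rows; the names meetings, pairs, and contacts and the column name Interactions are assumptions made for this example.

```r
meetings <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  Attendee = c('John', 'Mark', 'Kevin', 'Kevin', 'Sam'))

# self-join on meeting ID: every ordered pair of attendees per meeting
pairs <- merge(meetings, meetings, by = "ID")
names(pairs) <- c("ID", "Individual", "Contact")

# drop the rows where a person is paired with themselves
pairs <- pairs[pairs$Individual != pairs$Contact, ]

# count the shared meetings for each (Individual, Contact) pair
contacts <- aggregate(ID ~ Individual + Contact, data = pairs, FUN = length)
names(contacts)[3] <- "Interactions"

# contacts of one chosen individual
kevin <- contacts[contacts$Individual == "Kevin", ]
kevin
```

Because every pair appears in both directions, filtering on Individual is enough to get all of one person's contacts.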

P.S.: When posting a question relating to data and R, it helps to post example data as described here.

randomchars42