
I have to build a directed graph of a social network based on interactions with a business. My starting point is a two-column table [user_id, [Friends]]. The user_id entries come from subsetting a larger set on the basis of interaction with a specified business (if an interaction is detected, the user_id is included in this table). The user_id entries are factors, and the Friends entries are lists of factors, pulled directly from the database, that include all friends of each user_id.

example:

| user_id | Friends                |
|--------:|------------------------|
| Jake    | ['Laura','Bob','Mary'] |
| Laura   | ['Bob','John','Peter'] |
| Bob     | ['Jane','Fred','Mary'] | 

In order to determine my edges, I would like to cross-reference each user_id with the friends of every other user_id. From the example: Is Bob in Jake's or Laura's friends list? Is Jake in Bob's or Laura's friends list? Is Laura in Bob's or Jake's friends list?

Every time the answer is yes, an edge should be added between the two users. I am hoping to represent this in an adjacency matrix. The example would give something like this:

|       | Bob | Jake | Laura | Jane | Fred | Mary | John | Peter |
|------:|-----|------|-------|------|------|------|------|-------|
| Bob   |     |      |       |      |      |      |      |       |
| Jake  | 1   |      | 1     |      |      |      |      |       |
| Laura | 1   |      |       |      |      |      |      |       |
| Jane  |     |      |       |      |      |      |      |       |
| Fred  |     |      |       |      |      |      |      |       |
| Mary  |     |      |       |      |      |      |      |       |
| John  |     |      |       |      |      |      |      |       |
| Peter |     |      |       |      |      |      |      |       |
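
For concreteness, the cross-referencing described above might look something like the following base-R sketch. It is restricted to the three user_ids (where all nonzero entries of the table fall) and uses plain character vectors instead of factors for brevity; neither choice comes from my real data, so treat both as assumptions.

```r
# Example data from the table above, as character vectors
# (with factors, wrap the values in as.character() first).
user_id <- c("Jake", "Laura", "Bob")
Friends <- list(c("Laura", "Bob", "Mary"),
                c("Bob", "John", "Peter"),
                c("Jane", "Fred", "Mary"))

# For each user's friends list, flag which user_ids appear in it.
# sapply() puts one list owner per column, so transpose to get owners in rows.
adj <- t(sapply(Friends, function(f) as.integer(user_id %in% f)))
dimnames(adj) <- list(user_id, user_id)

# adj[i, j] is 1 when user j appears in user i's friends list.
adj
```

Here the row is the owner of the friends list and the column is the friend found in it, which matches the layout of the table.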

Finally, I would like to build a graph based on this matrix.

Thanks!

Edited for clarity and to add an example.

home_wrecker
  • What do you want help with? What have you done so far? See this http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example to create good questions. – Developer Nov 03 '15 at 19:47
  • Not sure what is unclear...the first line of my post: I have to build a directed graph of a social network based on interactions with a business. I then explain my starting variables. And on a separate paragraph: In order to determine my edges, I would like to cross reference each user_id with the friends of every user_id – home_wrecker Nov 03 '15 at 20:06
  • You usually show what you have done, your code and so on; then it is easier to help. Questions that you haven't tried to answer yourself (show your work!) are not helpful. – Developer Nov 03 '15 at 20:34

1 Answer


I'm not sure I completely understand the question, but maybe you could get your graph from an edge list created by merging your user id and friend tables. For example:

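    set.seed(1)
    n <- 10
    # Fake data: n rows pairing a user id (1-5) with a friend label (a-e)
    uid <- ceiling(runif(n, max=5))
    fid <- letters[ceiling(runif(n, max=5))]
    tab <- data.frame(user_id=uid, Friends=fid)
    # Two copies of the table: one with the user as ego, one as alter
    tab3 <- tab2 <- tab
    names(tab2) <- c('ego_id', 'Friends')
    names(tab3) <- c('alter_id', 'Friends')
    # Self-join on Friends: pairs up users who list the same friend
    mg <- merge(tab2, tab3)
    # Drop pairs where a user is matched with itself
    test <- mg$ego_id == mg$alter_id
    mg <- mg[!test, ]
    g <- igraph::graph_from_data_frame(mg[, c('ego_id', 'alter_id')], directed=TRUE)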

This turns the fake table, tab,

    user_id Friends
 1        2       b
 2        2       a
 3        3       d
 4        5       b
 5        2       d
 6        5       c
 7        5       d
 8        4       e
 9        4       b
 10       1       d

into this igraph graph, g:

 IGRAPH DN-- 5 18 --
 + attr: name (v/c)
 + edges (vertex names):
  [1] 2->5 2->4 5->2 5->4 4->2 4->5 3->1 3->5 3->2 1->3 1->5 1->2 5->3 5->1 5->2
 [16] 2->3 2->1 2->5

Now, from the comments, it has become clear that the question is how to get a graph from a factor and a list of factors using an adjacency matrix as an intermediate step. Here's one way to do that:

 > user_id <- factor(c('Jake', 'Laura', 'Bob'))
 > Friends <- list(factor(c('Laura', 'Bob', 'Mary')),                                                 
 +                 factor(c('Bob', 'John', 'Peter')),                                                 
 +                 factor(c('Jane', 'Fred', 'Mary')))                                                 
 > all_nodes <- unique(c(levels(unlist(Friends)), levels(user_id)))                                   
 > A1 <- sapply(Friends, function(x) all_nodes %in% x)                                                
 > colnames(A1) <- as.character(user_id)                                                              
 > rownames(A1) <- as.character(all_nodes)                                                            
 > test <- !as.character(all_nodes) %in% as.character(user_id)                                        
 > extra_cols <- as.character(all_nodes[test])                                                        
 > A2 <- matrix(FALSE, nrow=nrow(A1), ncol=length(extra_cols))                                        
 > colnames(A2) <- extra_cols                                                                         
 > A <- cbind(A1, A2)                                                                                 
 > A <- A[rownames(A), rownames(A)]                                                                   
 > A <- t(A)                                                                                          
 > g <- igraph::graph_from_adjacency_matrix(A)                                                        
 > g                                                                                                  
 IGRAPH DN-- 8 9 --
 + attr: name (v/c)
 + edges (vertex names):
 [1] Bob  ->Mary  Bob  ->Fred  Bob  ->Jane  Laura->Bob   Laura->John
 [6] Laura->Peter Jake ->Bob   Jake ->Laura Jake ->Mary
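
As mentioned in the comments, building the graph from an edge list is probably more efficient than going through a dense adjacency matrix. Here is a minimal sketch of that route for the same example data. The names `edges` and `edges_users` are just illustrative, and the igraph step is guarded so that the rest runs without the package installed.

```r
user_id <- factor(c("Jake", "Laura", "Bob"))
Friends <- list(factor(c("Laura", "Bob", "Mary")),
                factor(c("Bob", "John", "Peter")),
                factor(c("Jane", "Fred", "Mary")))

# One row per (user, friend) pair. as.character() avoids mixing up
# integer codes from factors that have different level sets.
edges <- data.frame(
  from = rep(as.character(user_id), lengths(Friends)),
  to   = unlist(lapply(Friends, as.character)),
  stringsAsFactors = FALSE
)

# Optional: keep only edges whose target is also one of the user_ids.
edges_users <- edges[edges$to %in% as.character(user_id), ]

# Build the directed graph if igraph is available.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = TRUE)
}
```

With the example data, `edges` has nine rows (the same nine edges as above) and `edges_users` keeps only the three edges between Jake, Laura, and Bob.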
e3bo
  • Hey! Thanks for the effort! I added an example to make things a little more clear. – home_wrecker Nov 03 '15 at 21:33
  • @home_wrecker, that is much clearer but if you want an example you can apply to your data it would probably help if you could specify more clearly the format of your data. Is user_id a factor and Friends a list of factors? You can determine these things using the class() function. – e3bo Nov 03 '15 at 22:36
  • yes and yes. user_id is a factor, and Friends is a list of factors – home_wrecker Nov 03 '15 at 22:57
  • @home_wrecker, there's now an example for that case. Probably it is more efficient to build the graph from an edge list but if your graph is small enough this approach might be OK. – e3bo Nov 03 '15 at 23:52
  • After further inspection, it seems all_nodes also contains a number of lists... is there a way to unravel those? – home_wrecker Nov 04 '15 at 01:33
  • Figured it out... it works as intended using: all_nodes <- unique(c(levels(unlist(levels(unlist(Friends)))), levels(user_id))) – home_wrecker Nov 04 '15 at 01:40
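
If some elements of Friends are themselves lists, one possible way to unravel them is a small recursive helper that converts every leaf to character before combining. The thread does not show the exact nesting, so the `Friends_nested` structure below is purely an assumption for illustration, and `flatten_chr` is a hypothetical helper, not part of base R or igraph.

```r
# Hypothetical helper: walk a (possibly nested) list and return all
# leaf values as one character vector.
flatten_chr <- function(x) {
  if (is.list(x)) {
    unlist(lapply(x, flatten_chr), use.names = FALSE)
  } else {
    as.character(x)  # works for factors and character vectors alike
  }
}

user_id <- factor(c("Jake", "Laura", "Bob"))

# Assumed shape: some entries are lists of factors rather than plain factors.
Friends_nested <- list(
  list(factor(c("Laura", "Bob")), factor("Mary")),
  factor(c("Bob", "John", "Peter")),
  list(factor(c("Jane", "Fred", "Mary")))
)

all_nodes <- unique(c(flatten_chr(Friends_nested), as.character(user_id)))
```

Working with character values rather than factor levels sidesteps the question of how levels get merged when factors with different level sets are combined.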