I'm following up on a prior question I asked here: Calculating ratio of reciprocated ties for each node in igraph

The answers were very helpful, but I realized one of the calculations isn't coming out correctly. I'm trying to figure out the ratio of reciprocated edges to outdegree--in other words, what percentage of people I nominate as friends nominate me as a friend?

When students don't nominate friends (outdegree is 0), they're not included in my calculation of reciprocated ties. Since they can't have any reciprocated ties, I want their reciprocity to be calculated as 0. Their ratio of reciprocated ties/outdegree should also be 0.

Here's an example:

library(igraph)    

###Creating sample edgelist###
from<- c("A", "A", "A", "B", "B", "B", "C", "D", "D", "E")
to<- c("B", "C", "D", "A", "E", "D", "A", "B", "C", "E")
weight<- c(1,2,3,2,1,3,2,2,1,1)
g2<- as.matrix(cbind(from,to, weight))

###Converting edgelist to network###
g3=graph.edgelist(g2[,1:2])
E(g3)$weight=as.numeric(g2[,3])

###Removing self-loop###
g3<-simplify(g3, remove.loops = T)

Here, E's indegree is 1 and outdegree is 0. I create a self-loop for E so the indegree and outdegree vectors remain the same length, and then remove it.

Next, I see which nominations are reciprocated:

recip<-is.mutual(g3)
recip<-as.data.frame(recip)

Then I create an edgelist from g3, and add recip to the data frame:

###Creating edgelist and adding recip###
edgelist<- get.data.frame(g3, what = "edges")
colnames(edgelist)<- c("from", "to", "weight")

edgelist<- cbind(edgelist, recip)
edgelist

> edgelist
  from to weight recip
1    A  B      1  TRUE
2    A  C      2  TRUE
3    A  D      3 FALSE
4    B  A      2  TRUE
5    B  D      3  TRUE
6    B  E      1 FALSE
7    C  A      2  TRUE
8    D  B      2  TRUE
9    D  C      1 FALSE

This is where the trouble begins. Since E isn't in from, it's also not in the objects I create below.

Next, I create a table with outdegree and add vertex names:

##Creating outdegree and adding vertex IDs##
outdegree<- as.data.frame(degree(g3, mode="out"))

ID<-V(g3)$name
outdegree<-cbind(ID, outdegree)
colnames(outdegree) <- c("ID","outdegree")
rownames(outdegree)<-NULL
outdegree

Outdegree comes out just as I want it:

  ID outdegree
1  A         3
2  B         3
3  C         1
4  D         2
5  E         0

When I calculate the number of reciprocated ties for each node, E isn't included, since I use the from column from edgelist I discussed above.

##Calculating number of reciprocated ties##
recip<-aggregate(recip~from,edgelist,sum)
colnames(recip)<- c("ID", "recip")
recip

> recip
  ID recip
1  A     2
2  B     2
3  C     1
4  D     1

So that's where the problem is. If I try to create a table with the ratio of reciprocated ties to outdegree, E isn't included:

##Creating ratio table##
ratio<-merge(recip, outdegree, by= "ID")
ratio<-as.data.frame (recip$recip/ratio$outdegree)
ratio<- cbind(recip$ID, ratio)
colnames(ratio)<- c("ID", "ratio")
ratio

  ID     ratio
1  A 0.6666667
2  B 0.6666667
3  C 1.0000000
4  D 0.5000000

Ultimately, I want a row in ratio for E that equals 0. Since the ratio here would be 0/0 (0 reciprocated ties/0 outdegree), I'd probably get an NaN but I can convert that to 0 easily, so that would be fine.

I could work around this and export the data to Excel, run the calculations by hand, and keep it easy. But that won't help improve my coding skills, and I have a bunch of networks to run, so it's also pretty inefficient.

Any thoughts on how to automate this?

Thanks again for your help.

Gary DeYoung

1 Answer

E is not showing up because E never appears in the from column of the edgelist data frame. It appears only in to, so aggregating by from alone produces no row for it.

You can aggregate on both columns and then merge.

r1 <- aggregate(recip~from,edgelist,sum)
colnames(r1) <- c("ID", "recip")
r2 <- aggregate(recip~to,edgelist,sum)
colnames(r2) <- c("ID", "recip")
recip <- merge(r1,r2, all = T) # all = T gives the union of the df's

Which gives:

  ID recip
1  A     2
2  B     2
3  C     1
4  D     1
5  E     0
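To get from this table to the ratio the question asks for, one option is to merge it with the outdegree table and treat 0/0 as 0. This is only a minimal sketch: the two data frames are typed in by hand from the tables shown in the question and in this answer, and the choice to keep every row of outdegree (all.x = TRUE) is an assumption about what is wanted, not something taken from the original code.

```r
# Values copied by hand from the tables in this thread (assumption: they match your data)
recip <- data.frame(ID = c("A", "B", "C", "D", "E"),
                    recip = c(2, 2, 1, 1, 0),
                    stringsAsFactors = FALSE)
outdegree <- data.frame(ID = c("A", "B", "C", "D", "E"),
                        outdegree = c(3, 3, 1, 2, 0),
                        stringsAsFactors = FALSE)

# Keep every vertex listed in outdegree, even if it has no row in recip
ratio <- merge(outdegree, recip, by = "ID", all.x = TRUE)

# A vertex missing from recip has no reciprocated ties
ratio$recip[is.na(ratio$recip)] <- 0

# 0 / 0 is NaN in R, so convert those cases to 0 afterwards
ratio$ratio <- ratio$recip / ratio$outdegree
ratio$ratio[is.nan(ratio$ratio)] <- 0

ratio
```

Merging against outdegree rather than against recip means a vertex with no edges at all would still get a row, since degree() reports every vertex in the graph.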

Alternatively, with pipelining (using dplyr):

library(dplyr)

edgelist %>% 
    aggregate(recip~from,.,sum) %>% 
    rename(ID = from) %>% 
    merge(., edgelist %>% 
                 aggregate(recip~to,.,sum) %>% 
                 rename(ID = to), 
          all = T)
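Both versions above rely on every vertex appearing in at least one of from or to. A different approach, sketched here in base R as an alternative rather than as part of the original answer, is to turn from into a factor whose levels are all vertex IDs, so that senders with no outgoing ties are kept as empty groups. The edgelist below is retyped from the question's output, and ids is built from the edgelist for the sake of a self-contained example; with a real graph you would probably want to take the IDs from V(g3)$name so that isolates are included too.

```r
# Edgelist retyped from the question's output (weight omitted, it is not needed here)
edgelist <- data.frame(
  from  = c("A", "A", "A", "B", "B", "B", "C", "D", "D"),
  to    = c("B", "C", "D", "A", "D", "E", "A", "B", "C"),
  recip = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Assumption: every vertex shows up somewhere in the edgelist
ids <- sort(unique(c(edgelist$from, edgelist$to)))

# split() keeps empty factor levels, so E gets an empty logical vector
by_sender <- split(edgelist$recip, factor(edgelist$from, levels = ids))

recip <- data.frame(
  ID        = ids,
  recip     = unname(vapply(by_sender, sum, integer(1))),  # sum of an empty vector is 0
  outdegree = unname(lengths(by_sender)),
  stringsAsFactors = FALSE
)

# Define the ratio as 0 when there are no outgoing ties
recip$ratio <- ifelse(recip$outdegree == 0, 0, recip$recip / recip$outdegree)

recip
```

This produces the counts, the outdegree, and the ratio in one table, so no merge is needed afterwards.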
paqmo
  • This is really helpful--thanks. I knew the problem was due to E not being in "from", but wasn't sure how to get it there. – Gary DeYoung Aug 02 '18 at 21:22
  • Sorry--posted prematurely, and ran out of time to edit. Here's the rest: The pipelining solution worked like a charm. I had a question about the first one, though. It looks like r1 and r2 are both aggregating "to". Was this a typo? When I try to code r1 to aggregate "from" it again leaves out E, and when I merge r1 and r2, E is left out of the output. I'll definitely use the pipelining strategy, but can you fill me in on what's going wrong with the first one? Thanks again! – Gary DeYoung Aug 02 '18 at 21:36
  • @GaryDeYoung If this answer was helpful, you should at least upvote it. If it actually answers your question, please accept it as the answer. – G5W Aug 03 '18 at 12:21
  • @GaryDeYoung yes, that was a typo! Also, add `all = T` to `merge` to ensure that all observations from both `r1` and `r2` are kept. – paqmo Aug 03 '18 at 13:01
  • @paqmo Got it! This all makes sense now. Just accepted the answer, by the way. Apologies for the delay. – Gary DeYoung Aug 03 '18 at 15:45