
I have a list of networks of family names. Each network is composed of two different subgraphs (for instance, net_277_278 is composed of subgraphs sg_277 and sg_278). I induced the subgraphs using a complicated mix of the gsub and induced_subgraph functions.

All the edges within the same subgraph have the form Johnson_278--Smith_278, indicating a link between two surnames belonging to the same subgraph. Links between the two subgraphs connect the same surname under two different subscripts, so a link between the two subgraphs looks like this: Johnson_277--Johnson_278.

For each subgraph, I want to compute some network measures (e.g., centrality), but only for the nodes that are directly connected to the other subgraph. (Note that I do not want to induce new subgraphs, since this would alter the measures.) For instance, for sg_277 I would like to compute some measures only for Johnson_277, if that's the only node with a link to sg_278.

As a small example, this is one of the networks:

net[10] #277_278
$11_277_278
IGRAPH UN-- 9 6 -- 
+ attr: name (v/c)
+ edges (vertex names):
[1] SANCHEZ_110277--SANCHEZ_110278 SANCHEZ_110277--PANTOJA_110277 SANCHEZ_110278--GALVAN_110278
[4] PEREZ_110278 --VEGA_110278 PEREZ_110278 --OLVERA_110278 PATIÑO_110278 --SERRANO_110278

Graphically, it looks like this

And this is one of the subgraphs:

sgraph1[[10]] #277
IGRAPH UN-- 2 1 -- 
+ attr: name (v/c)
+ edge (vertex names):
[1] SANCHEZ_110277--PANTOJA_110277

In this case, I would like to run the function for eigenvector centrality (evcent(sgraph1[[10]])$vector) only for the nodes linking the two subgraphs (in this case, just SANCHEZ_110277, which, as can be seen in the full network, has a link to SANCHEZ_110278).

Is there a way to do this? I have tried regular expressions and the neighbors() and ego() functions, but none of these seems to help.

  • Please provide a [small working example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) people can use to start tinkering. – Roman Luštrik Jul 11 '16 at 07:20
  • Can you post what you have tried with regular expressions? A proper regular expression filtering of the nodes should do the trick. – Ryan Haunfelder Jul 11 '16 at 14:34
  • If I did the following, I would be able to extract the numbers of the subscripts from the list of edges and then identify the nodes with a different subscript: `foo <- E(net[[5]]); gsub("^[A-Z]+_([0-9]+).*","\\1",foo); gsub(".*_([0-9]+)$","\\1",bar)` The problem is that `E(net[[5]])` is not a vector but an igraph object (I can't get rid of the header that says `148/148 edges (vertex names)`). – Pablo Balan Jul 11 '16 at 17:41
  • You could apply that gsub function to the edge list. Something like `get.num=function(x){ gsub("^[A-Z]+_([0-9]+).*","\\1",x)}; apply(get.edgelist(net[[5]]),2,get.num)` and then identify the rows whose two columns differ. – Ryan Haunfelder Jul 11 '16 at 19:23
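The edge-list idea from the comments can be sketched roughly as follows. This is only a minimal illustration under some assumptions: the toy graph is rebuilt by hand from the edges printed in the question, the object names (`net`, `sg_277`, `bridge_nodes`, `suffix`) are made up for the example, and `eigen_centrality()` is used as the current igraph name for `evcent()`. The subscript is taken as everything after the last underscore, rather than matching `[A-Z]+`, because a pattern like `[A-Z]+` may fail on names such as PATIÑO.

```r
library(igraph)

# Toy network rebuilt by hand from the edges shown in the question
edges <- matrix(c(
  "SANCHEZ_110277", "SANCHEZ_110278",
  "SANCHEZ_110277", "PANTOJA_110277",
  "SANCHEZ_110278", "GALVAN_110278",
  "PEREZ_110278",   "VEGA_110278",
  "PEREZ_110278",   "OLVERA_110278",
  "PATIÑO_110278",  "SERRANO_110278"
), ncol = 2, byrow = TRUE)
net <- graph_from_edgelist(edges, directed = FALSE)

# Hypothetical helper: the subscript is whatever follows the last underscore
suffix <- function(x) sub(".*_", "", x)

# as_edgelist() returns a plain character matrix, so there is no igraph header
el <- as_edgelist(net)

# An edge crosses the two subgraphs when its endpoints carry different subscripts
crossing <- suffix(el[, 1]) != suffix(el[, 2])
bridge_nodes <- unique(as.vector(el[crossing, , drop = FALSE]))

# Induce one subgraph as usual and compute the measure on ALL of its nodes,
# so the values are not altered by dropping vertices
sg_277 <- induced_subgraph(net, which(suffix(V(net)$name) == "110277"))
ev <- eigen_centrality(sg_277)$vector

# Keep only the values for nodes that touch the other subgraph
ev_bridge <- ev[intersect(names(ev), bridge_nodes)]
ev_bridge
```

The measure is computed on the whole subgraph and only subset afterwards, which matches the requirement of not inducing a further subgraph. For a list of networks, the same steps could presumably be wrapped in a function and applied with `lapply()`.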
