
I'm creating some network variables for a research project. The code uses the igraph package and runs without errors, but when I check the results for degree centralization and density, some of the values are greater than 1, which should be impossible. Is there something wrong in my code, or is there an error in the package?

This is the code:

library(igraph)  # graph_from_data_frame, edge_density, centralize, degree
library(haven)   # read_dta

WINDOW <- 3
YEARS <- 1975:2016
Acquirer_inventor_edges <- read_dta("~/Research projects/M&A - R&D structure/Data/test file.dta")


finaldata <- data.frame(
  firmid  = numeric(),
  year    = numeric(),
  dens    = numeric(),
  centr_c = numeric(),
  stringsAsFactors = FALSE
)

for (j in unique(Acquirer_inventor_edges$a_COMP_gvkey)) {
  print(j)
  # All edges belonging to acquirer j
  y <- subset.data.frame(Acquirer_inventor_edges, a_COMP_gvkey == j)

  for (f in YEARS) {
    # Edges from the WINDOW years preceding year f
    x <- subset.data.frame(y, appyear >= f - WINDOW & appyear < f)
    # Make the firm's 3-year window subset a network
    firm <- graph_from_data_frame(x, directed = TRUE, vertices = NULL)
    density <- edge_density(firm, loops = TRUE)

    centrality_cent <- centralize(degree(firm), normalized = FALSE) /
      ((vcount(firm) - 1) * (vcount(firm) - 2))

    firm_level_data <- data.frame(
      firmid  = j,
      year    = f,
      dens    = density,
      centr_c = centrality_cent,
      stringsAsFactors = FALSE
    )

    finaldata <- rbind(finaldata, firm_level_data)
  }
}
Emut
  • Can you provide some example data that reproduces the issue? Otherwise it's hard to say, but, for example, `edge_density` can be greater than 1 if you have duplicate edges in your data. – Esther Jun 22 '18 at 04:15
  • Hi Esther, sorry for the delay. I have created a test data file: https://drive.google.com/open?id=1Xe1nYPS3RMZa0sBXNFATN82l6-MTKrBs If you run the code on it, you will see that the centralization of the network is greater than 1 in some cases. – Charlotte Jacobs Jun 24 '18 at 17:34
  • Another possible command to calculate centralization is `centrality_cent <- centr_degree(firm, mode=c("all"), loops = TRUE, normalized = TRUE)$centralization`, but this also returns values higher than 1. – Charlotte Jacobs Jun 24 '18 at 17:40
  • Thanks for supplying your data. All of the results greater than 1 occur because you have duplicate edges in your data. You can either remove them, or you'll have to decide explicitly how you want those cases to be handled. – Esther Jun 24 '18 at 18:45
  • Thank you for your quick reply, Esther! Using weights for the edges might be a solution as well. Thanks! – Charlotte Jacobs Jun 24 '18 at 19:11
  • That's definitely an option. Some ways to do that are explained here: [R igraph convert parallel edges to weight attribute](https://stackoverflow.com/questions/12998456/r-igraph-convert-parallel-edges-to-weight-attribute) – Esther Jun 24 '18 at 19:17
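
The diagnosis in the comments can be illustrated with a minimal sketch. The toy edge list below is hypothetical (it is not the asker's data), and `g` and `g_simple` are names made up for this example. It shows how repeated edges can push `edge_density` above 1, and one possible way to collapse them into a `weight` attribute with `simplify()`:

```r
library(igraph)

# Hypothetical toy data: the edge a -> b appears six times, a -> c once.
edges <- data.frame(
  from = c(rep("a", 6), "a"),
  to   = c(rep("b", 6), "c"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

any_multiple(g)   # TRUE: the graph contains parallel edges
edge_density(g)   # 7 edges / 6 possible directed edges, so above 1

# Give every edge weight 1, then merge parallel edges and sum their weights.
E(g)$weight <- 1
g_simple <- simplify(
  g,
  remove.multiple = TRUE,
  remove.loops    = FALSE,
  edge.attr.comb  = list(weight = "sum")
)

E(g_simple)$weight       # how many original edges each remaining edge stands for
edge_density(g_simple)   # 2 edges / 6 possible, back within [0, 1]

# Degree centralization on the simplified graph
centr_degree(g_simple, mode = "all", loops = FALSE,
             normalized = TRUE)$centralization
```

Whether duplicates should be dropped, summed into weights, or handled some other way depends on what a repeated edge means in the data, so this is only one option among several.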
