I am trying to do network analysis in igraph but am having trouble transforming my dataset into an edge list (with weights), because the number of columns varies from row to row.

The data set looks as follows (the real one is much larger, of course). The first column is the main operator's id. A main operator can also appear as a partner and vice versa, so the same ids are used in both roles. The challenge is that the number of partners varies (from 0 to 40).

IdMain IdPartner1  IdPartner2  IdPartner3 IdPartner4 .....
1      4           3           2          NA
2      3           1          NA          NA
3      1           4           7          6
4      9           6           3          NA
.
.

My question is how to transform this into an undirected, weighted edge list (just expressing interaction):

Id1 Id2 weight
1   2    2
1   3    2
1   4    1
2   3    1    
3   4    2
.   .

Does anyone have a tip on the best way to go about this? Many thanks in advance!

julia_3010
  • can you try rephrasing your question to make it more clear what your data set is, and exactly how you want that to be translated into a graph? It is hard for me to see how the initial data set you provide would translate into either the adjacency matrix or edge list you provide. I can see that the adjacency matrix and edge list describe the same graph, I just don't see how the initial data can be translated into that graph. – SlowLoris Aug 09 '17 at 17:41
  • also, even though this isn't part of your question, it is not an arbitrary choice whether you use an adjacency matrix or edge list to describe your graph, so you should think about your situation and which one is better for you https://stackoverflow.com/questions/2218322/what-is-better-adjacency-lists-or-adjacency-matrices-for-graph-problems-in-c – SlowLoris Aug 09 '17 at 17:44
  • thanks @Slowloris, given the size of the dataset i think an edgelist is better. I have edited the question now and i hope there is a little more clarity. – julia_3010 Aug 09 '17 at 19:17

1 Answer

This is a classic reshaping task: the partner columns need to go from wide to long format. You can use melt() from the reshape2 package for this.

text <- "IdMain IdPartner1  IdPartner2  IdPartner3 IdPartner4
1      4           3           2          NA
2      3           NA          NA         NA
3      1           4           7          6
4      9           NA          NA         NA"

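# sep = "" splits on any run of whitespace
data <- read.delim(text = text, sep = "")

library(reshape2)
# Wide to long: one row per (IdMain, partner column) pair
data_melt <- melt(data, id.vars = "IdMain")
# Drop the NA padding and keep only the two endpoint columns
edgelist <- data_melt[!is.na(data_melt$value), c("IdMain", "value")]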

head(edgelist, 4)
#   IdMain value
# 1      1     4
# 2      2     3
# 3      3     1
# 4      4     9
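
To get from such an edge list to undirected weights, one option is to put the smaller id first in every pair and then count how often each pair occurs. The sketch below assumes that the weight is simply the number of times a pair appears, in either order. That reading is consistent with the desired output in the question, but it is an assumption. It uses base R only and rebuilds the question's sample rows so that it runs on its own; the names long, pairs and weighted are illustrative.

```r
# Sample rows from the question (NA marks "no partner").
data <- data.frame(
  IdMain     = c(1, 2, 3, 4),
  IdPartner1 = c(4, 3, 1, 9),
  IdPartner2 = c(3, 1, 4, 6),
  IdPartner3 = c(2, NA, 7, 3),
  IdPartner4 = c(NA, NA, 6, NA)
)

# Long format: one row per (main, partner) pair, NAs dropped.
partner_cols <- setdiff(names(data), "IdMain")
long <- data.frame(
  main    = rep(data$IdMain, times = length(partner_cols)),
  partner = unlist(data[partner_cols], use.names = FALSE)
)
long <- long[!is.na(long$partner), ]

# Undirected: order each pair so the smaller id comes first,
# which makes (1, 3) and (3, 1) the same edge.
pairs <- data.frame(
  Id1 = pmin(long$main, long$partner),
  Id2 = pmax(long$main, long$partner)
)

# Assumed weight: number of times each unordered pair occurs.
pairs$weight <- 1
weighted <- aggregate(weight ~ Id1 + Id2, data = pairs, FUN = sum)
weighted <- weighted[order(weighted$Id1, weighted$Id2), ]
rownames(weighted) <- NULL

print(weighted)
```

On the sample rows from the question this gives weight 2 for the pairs 1-2, 1-3 and 3-4 and weight 1 for 1-4 and 2-3, as in the desired output. The result can then be passed to igraph::graph_from_data_frame(weighted, directed = FALSE), which keeps the third column as an edge attribute named weight.
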
Taylor H
  • Thanks a lot @TaylorH! This really works well. One thing I wasn't clear enough about in my question (edited now) is the weight of the interactions (an edge list with associated values). Is there a way to take them into account as well? Sorry for the added question and thanks again. – julia_3010 Aug 09 '17 at 19:16
  • @julia_3010 how do your weights look in your data? – Taylor H Aug 09 '17 at 20:48
  • The structure is as above, but some interactions are repeated, or the 'main'/'partner' order is the other way round (e.g. `IdMain 1` partners with `number 3`, and in another project it's the same again, or `number 3` is `IdMain` for this interaction), if that makes sense. – julia_3010 Aug 09 '17 at 21:06
  • Sorry, you will need to be more clear. Can you use `dput` on some of your data to show an example? Or otherwise create an example dataframe? – Taylor H Aug 10 '17 at 00:04