How to do a co-occurrence matrix from multiple data frames in R

Question

English isn't my first language, so I apologize in advance for any mistakes. I'm new to R, as you will probably notice.

I'm trying to build a co-occurrence matrix. I have several data frames and I am interested in three variables: idT, numname and numstim. This is the single data frame that contains the merged data:

z=rbind(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10,df11,df12,df13,df14,
 df15,df16,df17,df18,df19,df20,df21,df22,df23,df24,df25,df26,df27,df28,df29,df30,df31,df32)
write.csv(z, file = ".../listz.csv")
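
As a side note, the same merge can be written without typing every name. This is only a minimal sketch on made-up toy data, assuming all data frames share the same columns; `dfs` is a hypothetical name, and the three small frames stand in for df1 to df32.

```r
# Toy stand-ins for df1..df32 (values are made up for illustration).
df1 <- data.frame(idT = 1:2, namegroup = c("A", "B"))
df2 <- data.frame(idT = 1:2, namegroup = c("B", "B"))
df3 <- data.frame(idT = 1:2, namegroup = c("A", "A"))

# Collect the data frames in a list (hypothetical name `dfs`) ...
dfs <- list(df1, df2, df3)

# ... and stack them row-wise in one call.
z <- do.call(rbind, dfs)
```

Keeping the pieces in a list also makes it easy to check how many rows each one contributes, for example with `sapply(dfs, nrow)`.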

Then I extracted the variables with:

#Extract columns 3 & 6 from all the files within the list
z1 = z[,c(3,6)]

#Create a new variable 'numname' to convert name groups into numeric groups, 
#then obtain levels with facNum
# factor() first, so the conversion also works when namegroup is a character column
z1$numname <- as.numeric(factor(z1$namegroup))
colnames(z1) <- c("namegroup", "idT", "numname")
facNum <- factor(z1$numname)
write.csv(z1, file = "...D:/z1.csv")

The data look like this:

           namegroup   idT   numname
1    GLISSEVIBREVITE   1       6
2          CINETIQUE   1       3
3 VIBRATIONS_LEGERES   1      20
4             DIFFUS   1       5
5            LIQUIDE   1       8
6        PICOTEMENTS   1      10

How to read the table: each idT is assigned to a group (namegroup), and this group is then converted to a numeric variable (numname).
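
To make the name-to-number step concrete, here is a minimal sketch on a made-up vector (the group names are borrowed from the table above, but their order here is invented). It assumes the codes come from `factor()`, which numbers the levels in sorted order.

```r
# Toy group names (order made up for illustration).
namegroup <- c("DIFFUS", "CINETIQUE", "DIFFUS", "LIQUIDE")

# factor() sorts the distinct names; as.integer() returns each name's position.
numname <- as.integer(factor(namegroup))

# Lookup table from numeric code back to group name.
codes <- levels(factor(namegroup))
```

With this coding, `codes[numname]` gives back the original names, so the numeric variable carries no information beyond the group label.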

# Specify z1 as a data frame to make next operations
z1 = as.data.frame(z1, idT = z1$numstim, numgroup = z1$numname)
tab1 <- table(z1)
write.csv(tab1, file = ".../tab1test.csv")
out1 <- data.matrix(tab1 %*% t(tab1))
write.csv(out1, file = ".../bmtest.csv")

But the bmtest matrix doesn't look like a count of idT pairs: only 22 users participated and there are 32 idT, yet some of the numbers are much higher:

    1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  
1   24  10  7   7   11  7   7   8   10  8   11  8   6   11  11  12  
2   10  32  27  7   5   4   7   4   4   4   5   3   2   6   6   14  
3   7   27  40  0   3   1   0   2   0   0   2   2   1   2   0   15  
4   7   7   0   30  7   14  15  9   15  13  13  7   5   12  13  5   
5   11  5   3   7   24  7   9   20  12  13  10  19  14  20  12  7

I want a matrix that shows how many times each pair of idT was grouped together. The matrix should look like this:

    1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  
1   15  3   2   2   3   3   2   1   2   1   3   3   1   3   3   5   
2   3   15  9   2   0   1   2   0   0   0   0   0   0   0   1   3   
3   2   9   15  0   2   1   0   2   0   0   1   1   1   2   0   2   
4   2   2   0   15  1   6   5   1   7   5   6   2   0   1   3   2   
5   3   0   2   1   15  1   2   12  4   5   3   13  9   11  3   2

In other words, I want to see which idT have been paired together. I've looked at this topic but didn't find a way to solve my problem.
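
To pin down what "paired together" could mean in code, here is a minimal sketch on toy data. It rests on assumptions that are not in my real data as shown: a hypothetical `user` column identifying who made each classification, and the rule that two idT are paired when the same user puts them in the same namegroup.

```r
# Toy data: the `user` column is hypothetical, values are made up.
toy <- data.frame(
  user      = c(1, 1, 1, 2, 2, 2),
  idT       = c(1, 2, 3, 1, 2, 3),
  namegroup = c("A", "A", "B", "A", "B", "B")
)

# One row per (user, group) combination, one column per idT.
inc <- unclass(table(interaction(toy$user, toy$namegroup, drop = TRUE), toy$idT))

# Cap counts at 1 so repeated rows are not counted twice.
inc <- (inc > 0) * 1

# crossprod(inc) is t(inc) %*% inc: entry [i, j] counts the (user, group)
# combinations that contain both idT i and idT j.
cooc <- crossprod(inc)
```

Under these assumptions the diagonal of `cooc` is the number of users who classified each idT, and no off-diagonal entry can exceed the number of users, which is the property I expect from the result.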

I also tried:

library(igraph)
library(tnet)
idT_numname <- cbind(z1$idT, z1$numname)
igraph <- graph.data.frame(idT_numname)

item_item <- projecting_tm(net = idT_numname, method="sum")
item_item <- tnet_igraph(item_item,type="weighted one-mode tnet")
itemmat <- get.adjacency(item_item,attr="weight")
itemmat  # 8x8 matrix of items to items

But I get an error message, and I don't know how to get around the "duplicated entries in the edgelist" problem, because it seems to me that duplicated entries are necessary to build a co-occurrence matrix:

> idT_numname <- cbind(z1$idT, z1$numname)
> item_item <- projecting_tm(idT_numname, method="sum")
Error in as.tnet(net, type = "binary two-mode tnet") : 
  There are duplicated entries in the edgelist

> item_item <- as.tnet(net = idT_numname, type ="binary two-mode tnet", method="sum")
Error in as.tnet(net = idT_numname, type = "binary two-mode tnet", method = "sum") : 
  unused argument (method = "sum")

> item_item <- as.tnet(net = idT_numname, type ="binary two-mode tnet")
Error in as.tnet(net = idT_numname, type = "binary two-mode tnet") : 
  There are duplicated entries in the edgelist
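
For reference, the duplicates that the error complains about can be inspected with base R alone. This is a minimal sketch on a made-up two-column matrix standing in for idT_numname; `n_dup` and `edges` are hypothetical names. Dropping the repeats should satisfy the duplicate check, but it also discards how many times each (idT, numname) pair occurred, so it may not be what a co-occurrence count needs.

```r
# Toy stand-in for cbind(z1$idT, z1$numname); values are made up.
idT_numname <- cbind(idT = c(1, 2, 1, 3, 2), numname = c(6, 6, 6, 3, 6))

# Number of rows that repeat an earlier (idT, numname) pair.
n_dup <- sum(duplicated(idT_numname))

# Keep a single copy of each pair.
edges <- unique(idT_numname)
```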

Your help is greatly appreciated. I like doing data analysis and I want to learn more every day!

Thank you
