I have a list of variables occurring across different studies. Each variable is coded 1 if it is present in the given study and 0 if it is not, and the table looks like this:

        var1 var2 var3
study1   1    0    0
study1   1    0    1
study1   0    0    0    etc.

My objective is to create a table with the number of times each variable occurs together with each of the other variables, i.e. something like this:

        var1 var2 var3
var1     -    2    4
var2     2    -    1
var3     4    0    -    etc.

How can I do this in R?

I have tried to find a guide or a similar question, but have come up empty.

Thank you in advance for your help!

Aden
  • Welcome to stackoverflow. Please see here: – TarJae Mar 01 '21 at 22:21
  • Your input dataset indicates that var1 and var3 are both present and not present in study1. How is that possible? – Limey Mar 02 '21 at 00:46
  • I think this is what you are looking for - the cross product of a tabulation gives the co-occurrence matrix - https://stackoverflow.com/questions/19977596/how-do-i-calculate-the-co-occurrence-in-the-table – thelatemail Mar 02 '21 at 03:38

1 Answer

The following code calculates how many times each variable occurs with each of the other variables and puts them in a symmetric matrix. The diagonal entries are -1.

df=data.frame(var1=c(1,1,0), var2=c(1,0,0), var3=c(0,1,1))
df

  var1 var2 var3
1    1    1    0
2    1    0    1
3    0    0    1

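# start with -1 on the diagonal and 0 everywhere else
mat=diag(-1, ncol(df))
# visit each pair of columns (i < j) once
for (i in seq_len(ncol(df)-1)) {
  for (j in (i+1):ncol(df)) {
      # count the rows in which variables i and j are both 1
      num=sum(df[,i]==1 & df[,j]==1)
      # the co-occurrence count is symmetric, so fill both halves
      mat[i,j]=num
      mat[j,i]=num
  }
}
mat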

     [,1] [,2] [,3]
[1,]   -1    1    1
[2,]    1   -1    0
[3,]    1    0   -1
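
As a comment on the question suggests, the same counts can also be obtained without a loop from the matrix cross product. The following is only a minimal sketch, assuming every column contains nothing but 0 and 1 and no missing values; the name `co` is just an illustrative choice.

```r
# Same example data as above
df <- data.frame(var1 = c(1, 1, 0), var2 = c(1, 0, 0), var3 = c(0, 1, 1))

# crossprod(m) computes t(m) %*% m. For a 0/1 matrix, entry [i, j] is the
# number of rows in which variables i and j are both 1.
co <- crossprod(as.matrix(df))

# The diagonal holds each variable's own total count; overwrite it with -1
# so the result matches the loop version.
diag(co) <- -1
co
```

This should give the same off-diagonal values as the loop, with the variable names kept as row and column names. If the diagonal is left untouched, it shows how many rows contain each variable.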
Vons