I have a set of categorical variables in one-hot encoding. I'm trying to build something like a correlation matrix, but instead of correlations each cell should count how many times a pair of variables is "on" together (that is, the number of cases where both variables are 1). I know I can calculate that for a single pair by multiplying the two vectors element-wise and summing the result, since only the positions where both are 1 contribute to the sum. But I can't think of a way to build the full matrix. For example, I have this dataset:
A B C D E
1 1 0 1 0
0 1 0 0 1
0 0 1 1 1
0 0 1 0 1
0 0 0 0 1
I need a matrix like this (the diagonal values don't really matter):
A B C D E
A - 1 0 1 0
B 1 - 0 1 1
C 0 0 - 1 2
D 1 1 1 - 1
E 0 1 2 1 -
Notice, for example, that E-C is 2 because on 2 occasions both were on (1).
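
The multiply-and-sum idea for one pair can be applied to every pair of columns in turn. Below is a minimal sketch in plain Python, assuming the data is held as a list of rows; the names `names`, `rows`, and `cooccurrence` are only illustrative and do not come from any library.

```python
# Column names and rows copied from the example dataset.
names = ["A", "B", "C", "D", "E"]
rows = [
    [1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]


def cooccurrence(rows):
    """Return a square matrix whose [i][j] entry counts the rows
    in which column i and column j are both 1."""
    # Transpose the data so that each tuple holds one variable's values.
    columns = list(zip(*rows))
    # For every pair of columns: multiply element-wise, then sum.
    return [
        [sum(a * b for a, b in zip(ci, cj)) for cj in columns]
        for ci in columns
    ]


counts = cooccurrence(rows)

# Print the matrix with row and column labels.
print("  " + " ".join(names))
for name, row in zip(names, counts):
    print(name + " " + " ".join(str(v) for v in row))
```

In this sketch the diagonal holds each variable's own count (how many times it is 1), which can simply be ignored. If the data is already in a NumPy array `X` of 0s and 1s, the same matrix should be obtainable in one step as `X.T @ X`.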