I have a dataframe of N columns. Each element in the dataframe is in the range [0, N-1].
For example, my dataframe can be something like this (N=3):
   A  B  C
0  0  2  0
1  1  0  1
2  2  2  0
3  2  0  0
4  0  0  0
I want to create a co-occurrence matrix (please correct me if there is a different standard name for that) of size N x N, in which each element ij contains the number of rows where columns i and j assume the same value.
   A  B  C
A  x  2  3
B  2  x  2
C  3  2  x
Here, for example, matrix[0, 1] = 2 means that A and B assume the same value 2 times.
I don't care about the value on the diagonal.
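To make the definition concrete, here is a minimal sketch that reproduces the expected matrix for the example above. It assumes pandas and NumPy are available and compares every pair of columns with NumPy broadcasting; the names `df`, `values`, `equal` and `cooc` are only illustrative, and this is one possible approach rather than necessarily the best one.

```python
import numpy as np
import pandas as pd

# Example dataframe from above (N = 3).
df = pd.DataFrame({
    "A": [0, 1, 2, 2, 0],
    "B": [2, 0, 2, 0, 0],
    "C": [0, 1, 0, 0, 0],
})

values = df.to_numpy()  # shape: (n_rows, N)

# Broadcast to compare every column with every other column, row by row.
# equal[r, i, j] is True when columns i and j hold the same value in row r.
equal = values[:, :, None] == values[:, None, :]  # shape: (n_rows, N, N)

# Sum over rows to count the matches for each pair of columns.
counts = equal.sum(axis=0)  # shape: (N, N)

cooc = pd.DataFrame(counts, index=df.columns, columns=df.columns)
print(cooc)
```

The off-diagonal entries match the table above. The diagonal simply ends up as the number of rows, since each column always equals itself. Note that the intermediate boolean array has n_rows x N x N elements, so this sketch may use a lot of memory for large inputs.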
What is the smartest way to do that?