I am trying to calculate the total number of unique interactions that exist between the categorical features in a dataset.
Assume a small dataframe:
   Fruit    Vegetable  Animal
---------------------------------------------------
0  Apple    Carrot     Rabbit
1  Apple    Lemon      Fish
2  Banana   Cucumber   Cat
3  Orange   Lemon      Fish
4  Melon    Lettuce    Cat
5  Mango    Lemon      Fish
---------------------------------------------------
How do I calculate the total number of unique pairwise interactions between the features? The Fruit column has 5 unique categories, the Vegetable column has 4, and the Animal column has 3. So, if I am not mistaken, the number of all possible combinations across the three columns is 5 x 4 x 3 = 60. However, I would like to calculate the number of pairwise combinations that actually occur in the given dataset.
So, for example, Apple-Carrot is one, Carrot-Rabbit is another. Lemon-Fish also counts as one, despite appearing three times in the dataset.
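
To make the counting rule above concrete, here is a minimal sketch of one possible approach, assuming the data lives in a pandas DataFrame. The helper name count_observed_pairs and the variable names are made up for illustration. For each pair of columns it drops duplicate rows and counts what is left, so Lemon-Fish is counted once.

```python
from itertools import combinations

import pandas as pd

# The example data from the question.
df = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Banana", "Orange", "Melon", "Mango"],
    "Vegetable": ["Carrot", "Lemon", "Cucumber", "Lemon", "Lettuce", "Lemon"],
    "Animal": ["Rabbit", "Fish", "Cat", "Fish", "Cat", "Fish"],
})


def count_observed_pairs(frame):
    """Hypothetical helper: distinct value pairs seen for each pair of columns."""
    counts = {}
    for a, b in combinations(frame.columns, 2):
        # Repeated rows such as Lemon-Fish collapse to a single row here.
        counts[(a, b)] = len(frame[[a, b]].drop_duplicates())
    return counts


observed = count_observed_pairs(df)
total_observed = sum(observed.values())

# For comparison: the largest number of pairs that could exist per column
# pair, i.e. the product of the two columns' unique-category counts.
n_unique = df.nunique()
possible = {
    (a, b): int(n_unique[a] * n_unique[b])
    for a, b in combinations(df.columns, 2)
}
total_possible = sum(possible.values())

print(observed)        # Fruit-Vegetable: 6, Fruit-Animal: 6, Vegetable-Animal: 4
print(total_observed)  # 16
print(total_possible)  # 47 (5*4 + 5*3 + 4*3)
```

With this data the sketch finds 16 observed pairs out of 47 possible ones. Note that it treats a pair as tied to its two columns, so the same string appearing in two different columns would not be merged.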