Python: count co-occurrences of values across data frame columns

Asked Jan 20 '19 at 18:21

Active Jan 20 '19 at 18:34

Viewed 726 times

Here is my data frame:

import pandas as pd

df = pd.DataFrame({'c1': [1, 4, 7, 5, 6], 'c2': [2, 5, 1, 7, 8], 'c3': [3, 1, 2, 4, 6], 'c4': [3, 9, 5, 4, 8], 'c5': [1, 2, 3, 4, 5], 'c6': [2, 5, 1, 7, 8]})

The digits are product codes, not numeric quantities. I'm looking for something like a correlation matrix that compares the similarity of columns by counting their intersections. Could you please help me write a loop that counts the number of shared codes for every pair of columns?

Sample output:

c1 with c2 ... times, c3 ... times, c4 ... times, c5 ... times, c6 ... times
c2 with c3 ... times, c4 ... times, c5 ... times, c6 ... times
c3 with c4 ... times, c5 ... times, c6 ... times
and so on
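One possible reading of this request is "for each pair of columns, how many distinct codes appear in both", ignoring row position and repeated codes. That reading is an assumption, and the names `code_sets` and `pair_counts` are made up for illustration. A minimal sketch of such a loop:

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    'c1': [1, 4, 7, 5, 6],
    'c2': [2, 5, 1, 7, 8],
    'c3': [3, 1, 2, 4, 6],
    'c4': [3, 9, 5, 4, 8],
    'c5': [1, 2, 3, 4, 5],
    'c6': [2, 5, 1, 7, 8],
})

# Assumption: codes are compared as labels, so each column becomes a set
# of the distinct codes it contains.
code_sets = {col: set(df[col]) for col in df.columns}

# Count the shared codes for every unordered pair of columns.
pair_counts = {
    (a, b): len(code_sets[a] & code_sets[b])
    for a, b in combinations(df.columns, 2)
}

for (a, b), n in pair_counts.items():
    print(f"{a} with {b}: {n} times")
```

If repeated codes or row positions should matter, the set-based comparison would need to be replaced with a different counting rule.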

P.S. I checked for duplicates but could not find the same problem.

edited Jan 20 '19 at 18:30

asked Jan 20 '19 at 18:21

Jerry

What is the desired output? – Karn Kumar Jan 20 '19 at 18:24
Either the one mentioned above or, ideally, a correlation-style matrix where each value is the number of intersections. – Jerry Jan 20 '19 at 18:25
Something like `df.T.dot(df)` – Karn Kumar Jan 20 '19 at 18:30
@pygo - no, because the OP says the digits are product codes, not numbers. – jezrael Jan 20 '19 at 18:32
You can check [here](https://stackoverflow.com/questions/42814452/co-occurrence-matrix-from-list-of-words-in-python/42814963) or [here as well](https://stackoverflow.com/questions/20574257/constructing-a-co-occurrence-matrix-in-python-pandas) – Karn Kumar Jan 20 '19 at 18:33
@jezrael, you are good to post an answer then, for sure :-) – Karn Kumar Jan 20 '19 at 18:35
@pygo - I have no experience with this type of problem, but it seems Rafael added an answer in the duplicate. – jezrael Jan 20 '19 at 18:36
I found the same thread that I mentioned above :-) – Karn Kumar Jan 20 '19 at 18:38
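To make the objection to `df.T.dot(df)` concrete: that product multiplies the codes as if they were numbers. A variant that might fit better is to first build a 0/1 indicator table (code present in column or not) and take the dot product of that instead. This is only a sketch under the assumption that "intersection" means distinct shared codes; the names `long_form`, `indicator` and `co_matrix` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'c1': [1, 4, 7, 5, 6],
    'c2': [2, 5, 1, 7, 8],
    'c3': [3, 1, 2, 4, 6],
    'c4': [3, 9, 5, 4, 8],
    'c5': [1, 2, 3, 4, 5],
    'c6': [2, 5, 1, 7, 8],
})

# Long format: one row per (column name, code) occurrence.
long_form = df.melt(var_name='column', value_name='code')

# Indicator table: rows are codes, columns are the original columns,
# 1 if the code appears in that column at least once, else 0.
indicator = (pd.crosstab(long_form['code'], long_form['column']) > 0).astype(int)

# Entry (a, b) counts the codes present in both column a and column b;
# the diagonal holds the number of distinct codes in each column.
co_matrix = indicator.T.dot(indicator)

print(co_matrix)
```

The result is a symmetric column-by-column table, which is close to the "correlation matrix with values" layout asked for above.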
