You could show a square heatmap showing, for each pair of sources, how many times they appear together. (The diagonal then holds the total number of times each source appeared, and the heatmap is symmetric.)
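import pandas as pd
import seaborn as sns

cols=df.columns[:-1] # All columns but the last one: ignoring `count`. You haven't said what it is and what to do with it
M=df[cols].values # Numpy array of values (just me being more comfortable with numpy. There are certainly direct ways in pandas). Must hold 0/1 integers or booleans for `&` to work
matrix=(M[:,:,None] & M[:,None,:]).sum(axis=0) # For each pair of sources, number of rows where both are present
coOccDf = pd.DataFrame(matrix, index=cols, columns=cols)
sns.heatmap(coOccDf, annot=True)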

Edit
To take the count into account, the way OCa did (also upvoted :D), we can weight each row by its count, still without for loops:
cols=df.columns[:-1] # All columns but the last one, `count`
counts=df['count'].values # Weight of each row
M=df[cols].values # Numpy array of values (just me being more comfortable with numpy. There are certainly direct ways in pandas)
matrix=((M[:,:,None] & M[:,None,:])*counts[:,None,None]).sum(axis=0)
coOccDf = pd.DataFrame(matrix, index=cols, columns=cols)
sns.heatmap(coOccDf, annot=True)
This requires some explanation.
M is the 2d array (shape (3,4) in the example) of the dataframe values, minus the count column.
The 1st axis is the case number (well, the rows of the dataframe; I am not sure what rows represent exactly here), and the 2nd axis is the sources.
So M[:,:,None] is a 3d array (shape (3,4,1) in the example): 1st axis = case number, 2nd axis = source #1, and 3rd axis = source #2, with source #2 being a void axis of size 1 (for broadcasting later).
Likewise, M[:,None,:] is a 3d array (shape (3,1,4) in the example): 1st axis = case number, 2nd axis = source #1, here a void axis for broadcasting, and 3rd axis = source #2.
So, it is just M, but with different arrangements (when printing M, M[:,None,:] or M[:,:,None], all that changes is some extra [ or ] in the output).
So, any operation between those two triggers broadcasting: axes of size 1 are virtually expanded as if they had the same size as the corresponding axis in the other operand, repeating the values along them.
This is how we write nested for loops in numpy without actually writing the for loops (not for aesthetic reasons, of course, but because it is way faster to trick numpy into doing the for loops in C than to write them in Python).
Just one example of broadcasting: np.array([1,2,3])[:,None]+np.array([10,20,30])[None,:]. Here np.array([1,2,3])[:,None] is a 3×1 array, [[1],[2],[3]], and np.array([10,20,30])[None,:] is a 1×3 array, [[10,20,30]]. So the addition behaves as if we were adding [[1,1,1],[2,2,2],[3,3,3]] and [[10,20,30],[10,20,30],[10,20,30]]: data is repeated along the singleton axes. The result is [[11,21,31],[12,22,32],[13,23,33]], exactly as if I had written for i in range(3): for j in range(3): res[i,j]=A[i]+B[j].
Search "numpy broadcasting" for better explanations than this one. But this is what I do with my M, except that there are 3 axes, because I need 3 nested for loops (for i in rows: for j in sources: for k in sources) to count the number of common cases between source j and source k, for all combinations of j and k.
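To make the broadcasting example concrete, here is a small sketch that runs both the broadcast addition and the explicit double loop and checks that they agree. The names A, B, broadcast and looped are only there for illustration.

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([10, 20, 30])

# Broadcast version: shape (3,1) + shape (1,3) gives shape (3,3)
broadcast = A[:, None] + B[None, :]

# Explicit double loop computing the same thing
looped = np.zeros((3, 3), dtype=int)
for i in range(3):
    for j in range(3):
        looped[i, j] = A[i] + B[j]

print(broadcast)
# [[11 21 31]
#  [12 22 32]
#  [13 23 33]]
print(np.array_equal(broadcast, looped))  # True
```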
So here M[:,:,None] & M[:,None,:] is a 3d array holding, for each case i and for each pair of sources j and k, a value (M[:,:,None] & M[:,None,:])[i,j,k] that is true iff, in case i, both sources j and k are present.
In my first version, (M[:,:,None] & M[:,None,:]).sum(axis=0) is therefore a 2d array giving, for each pair of sources j and k, the number of cases having both source j and source k (we summed the True/False, aka 1/0, values along axis 0, that is, along the case axis).
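Here is a sketch of that first version next to the triple loop it replaces. The values in M are made up for illustration (3 cases, 4 sources, same shape as the example); they are not the data from the question.

```python
import numpy as np

# Hypothetical 0/1 data: 3 cases (rows) x 4 sources (columns)
M = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
])

# pairwise[i, j, k] is 1 iff case i has both source j and source k
pairwise = M[:, :, None] & M[:, None, :]   # shape (3, 4, 4)
matrix = pairwise.sum(axis=0)              # shape (4, 4)

# The same computation with explicit loops
n_cases, n_sources = M.shape
res = np.zeros((n_sources, n_sources), dtype=int)
for i in range(n_cases):
    for j in range(n_sources):
        for k in range(n_sources):
            res[j, k] += M[i, j] & M[i, k]

print(matrix)
# [[3 2 2 0]
#  [2 2 1 0]
#  [2 1 2 0]
#  [0 0 0 0]]
print(np.array_equal(matrix, res))  # True
```

The diagonal is just the per-source totals, and the matrix is symmetric, as expected.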
In the second version, to take the count into account, I multiply each case by a weight before summing. But to be able to multiply a 3d array of shape (3,4,4) by the 3 count values, I need another broadcast, counts[:,None,None], meaning that each of the 3 values (one per case) is virtually repeated 4x4 times.
In other words, it is as if I had written: for i in cases: for j in sources: for k in sources: res[j,k]+=(M[i,j]&M[i,k])*counts[i]
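A sketch of the weighted version against its loop equivalent. Both M and counts are made-up values (I deliberately use counts that are not all equal, unlike the example, so the weighting is visible).

```python
import numpy as np

# Hypothetical 0/1 data: 3 cases x 4 sources, and one weight per case
M = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
])
counts = np.array([2, 1, 3])

# counts[:, None, None] has shape (3, 1, 1) and broadcasts against (3, 4, 4)
weighted = ((M[:, :, None] & M[:, None, :]) * counts[:, None, None]).sum(axis=0)

# The same computation with explicit loops
n_cases, n_sources = M.shape
res = np.zeros((n_sources, n_sources), dtype=int)
for i in range(n_cases):
    for j in range(n_sources):
        for k in range(n_sources):
            res[j, k] += (M[i, j] & M[i, k]) * counts[i]

print(weighted)
# [[6 5 4 0]
#  [5 5 3 0]
#  [4 3 4 0]
#  [0 0 0 0]]
print(np.array_equal(weighted, res))  # True
```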
I don't include another screenshot because, in that example, all counts being 2, it would be exactly the same as before with all values multiplied by 2 (as we can see in OCa's answer).
It is not exactly the one-liner OCa called for. But that is just because I didn't want to obfuscate the computation. All the intermediate variables could be replaced by their values to produce a one-liner. The important point is: no for loops.
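For what it's worth, here is roughly what the one-liner could look like once the intermediate variables are inlined. The dataframe below is a made-up stand-in (source names A to D are hypothetical, with all counts equal to 2 as in the example), not the data from the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe; `count` is the last column
df = pd.DataFrame({
    "A": [1, 1, 1],
    "B": [1, 0, 1],
    "C": [0, 1, 1],
    "D": [0, 0, 0],
    "count": [2, 2, 2],
})

cols = df.columns[:-1]
# Same computation as above, with M and counts replaced by their expressions
coOccDf = pd.DataFrame(((df[cols].values[:, :, None] & df[cols].values[:, None, :]) * df["count"].values[:, None, None]).sum(axis=0), index=cols, columns=cols)

print(coOccDf)
```

Passing coOccDf to sns.heatmap(coOccDf, annot=True) then draws the heatmap as before. I find the multi-line version easier to read, though.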