
I made a co-occurrence matrix using sklearn's CountVectorizer and saved it as a CSV file. Let's say it looks something like this:

  Unnamed: 0  a  b  c  d
0          a  0  1  0  0
1          b  2  0  1  0
2          c  0  1  0  3
3          d  0  0  1  0

What would be the easiest way to plot a co-occurrence network with this dataframe serving as the co-occurrence matrix?

  • Maybe check out [this answer](https://stackoverflow.com/questions/53444717/plotting-a-network-using-a-co-occurrence-matrix) which seems to be asking a similar question. You're on the right track with networkx. – Philip Ciunkiewicz Aug 19 '20 at 17:53
  • Maybe something like: `G=nx.from_pandas_adjacency(df)` followed by `nx.draw_networkx(G)`? Not sure about the weights though – ALollz Aug 19 '20 at 18:04

1 Answer


As @ALollz mentioned in the comments, you can use `G = nx.from_pandas_adjacency(df)` to create a graph from your pandas DataFrame and then visualize it with `pyvis.network` as follows:

import pandas as pd
import numpy as np
import networkx as nx
from pyvis.network import Network

# creating a dummy adjacency matrix of shape 20x20 with random integer values from 0 to 2
# (the upper bound of np.random.randint is exclusive)
adj_mat = np.random.randint(0, 3, size=(20, 20))
np.fill_diagonal(adj_mat, 0)  # setting the diagonal values as 0
df = pd.DataFrame(adj_mat)

# create a graph from your dataframe
G = nx.from_pandas_adjacency(df)

# visualize it with pyvis
N = Network(height='100%', width='100%', bgcolor='#222222', font_color='white')
N.barnes_hut()
for n in G.nodes:
    N.add_node(int(n))
for e in G.edges:
    N.add_edge(int(e[0]), int(e[1]))

N.write_html('./coocc-graph.html')
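
For the labelled matrix in the question, a minimal sketch (not part of the code above) might look like the following. It assumes the CSV was written with the labels in the first column, hence `index_col=0`; the inline `csv_text` string is only a stand-in for the real file. Because the sample matrix is not symmetric, it also assumes that the larger of the two directions is the intended undirected weight, which may not match how your matrix was built.

```python
from io import StringIO

import networkx as nx
import numpy as np
import pandas as pd

# Stand-in for the saved CSV file (assumption: labels are in the first column).
csv_text = """,a,b,c,d
a,0,1,0,0
b,2,0,1,0
c,0,1,0,3
d,0,0,1,0
"""

# index_col=0 turns the 'Unnamed: 0' column into the row index, so that
# row and column labels match, as from_pandas_adjacency requires.
df = pd.read_csv(StringIO(csv_text), index_col=0)

# Assumption: symmetrize by keeping the larger count of (i, j) and (j, i).
sym = pd.DataFrame(
    np.maximum(df.values, df.values.T),
    index=df.index,
    columns=df.columns,
)

# Non-zero entries become edges; each value is stored as the edge's
# 'weight' attribute.
G = nx.from_pandas_adjacency(sym)

for u, v, w in G.edges(data="weight"):
    print(u, v, w)
```

With string labels like these, the `int(...)` casts in the pyvis loop above would have to be dropped. As far as I know, the weight can be forwarded with `N.add_edge(u, v, value=w)` so that pyvis scales the edge width, but check the pyvis documentation for your version.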


Azim Mazinani