I am doing LDA topic modeling in Python, and the following is my code for visualization:

import pyLDAvis.gensim
pyLDAvis.enable_notebook()
vis = pyLDAvis.gensim.prepare(lda_model, corpus, dictionary=lda_model.id2word)
vis

I am looking for a way to export the Intertopic Distance Map to PDF, or at least to plot it with matplotlib and then save it as a PDF. Any ideas?

Samira Khorshidi

1 Answer

You can export the prepared visualization data to JSON and then plot it with matplotlib, which can save a figure as a PDF.

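# Export results in JSON format

import pyLDAvis
import pyLDAvis.gensim  # this module is named pyLDAvis.gensim_models in pyLDAvis 3.x

# prepare() builds the visualization data; no notebook display is needed for the export
vis = pyLDAvis.gensim.prepare(lda_model, corpus, lda_model.id2word)
pyLDAvis.save_json(vis, '/results/lda.json')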

# Read JSON file

import json

# json.load parses the file directly into a dict
with open('/results/lda.json', 'r') as myfile:
    json_data = json.load(myfile)


# Plot with matplotlib

import matplotlib.pyplot as plt

mds = json_data['mdsDat']
x_range = max(mds['x']) - min(mds['x'])
y_range = max(mds['y']) - min(mds['y'])

# Pad each axis by its full data range so that large circles are not clipped
plt.axis([min(mds['x']) - x_range, max(mds['x']) + x_range,
          min(mds['y']) - y_range, max(mds['y']) + y_range])
plt.gca().set_aspect('equal')  # keep the circles round

# Depending on the number of topics, you may need to tweak the parameters (e.g. make the circle radius Freq/100 or Freq/200)

for x, y, freq in zip(mds['x'], mds['y'], mds['Freq']):
    circle = plt.Circle((x, y), radius=freq / 100)
    plt.gca().add_artist(circle)

# Example output path; save before show(), which may clear the figure
plt.savefig('/results/lda.pdf')
plt.show()
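
If you want the export as a reusable step with topic labels, the following is a minimal sketch rather than a definitive implementation. It assumes the 'mdsDat' dict has the keys 'x', 'y', 'Freq' and 'topics', as in the JSON written by pyLDAvis.save_json. The function name save_intertopic_map and the scale parameter are hypothetical, and scaling the radius with the square root of Freq (so that circle area tracks Freq) is a design choice of this sketch, not something taken from the code above.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt


def save_intertopic_map(mds, pdf_path, scale=0.02):
    """Draw one labelled circle per topic and save the figure to pdf_path.

    mds is assumed to be the 'mdsDat' dict from the pyLDAvis JSON.
    scale is a hypothetical tuning knob: radius = scale * sqrt(Freq).
    """
    fig, ax = plt.subplots(figsize=(6, 6))
    radii = [scale * math.sqrt(freq) for freq in mds["Freq"]]

    for x, y, r, topic in zip(mds["x"], mds["y"], radii, mds["topics"]):
        ax.add_patch(plt.Circle((x, y), r, alpha=0.4, edgecolor="black"))
        ax.annotate(str(topic), (x, y), ha="center", va="center")

    # Pad the limits by the largest radius so no circle is clipped
    pad = max(radii)
    ax.set_xlim(min(mds["x"]) - pad, max(mds["x"]) + pad)
    ax.set_ylim(min(mds["y"]) - pad, max(mds["y"]) + pad)
    ax.set_aspect("equal")  # keep the circles round
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")

    fig.savefig(pdf_path)  # output format is inferred from the .pdf extension
    plt.close(fig)
    return radii
```

With the JSON loaded as above, a call would look like save_intertopic_map(json_data['mdsDat'], 'lda.pdf'), where 'lda.pdf' is only an example path. In a notebook you can drop the matplotlib.use("Agg") line if you also want inline display.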
fmarques