I am doing topic modeling on tweets in Python, working with two time periods. I want to extract topics with textacy (the spaCy-based library), training the model on the corpus that covers both time periods. Then I want to analyse the weight of the topics in the tweets of only one period. I know how to do it for the whole corpus:

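import matplotlib.pyplot as plt
import pandas as pd

# model and doc_topic_matrix come from the textacy topic model fitted on the full corpus
topic_weight_serie = pd.Series(model.topic_weights(doc_topic_matrix))
fig = plt.figure()
ax1 = fig.add_subplot(111)
bars = ax1.bar(range(len(topic_weight_serie)), topic_weight_serie, color='c', edgecolor='black')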

I am quite sure the solution is very simple, but I could not find it. Any ideas? One option might be to save the model and apply it to the subset of tweets I want to analyse. If that is the way to go, how do I do it? Thank you very much!

s12345
  • What do you mean by "analyse the weight of the topics"? Do you want to color things differently depending on their period? – polm23 May 02 '22 at 02:54
  • I want a histogram that represents the distribution of topics in a specific part of my corpus. I have four lists of tweets (2 users, 2 periods) and I want the topic weights of 1 user in 2 specific time frames. Both a histogram and bare numbers would be perfectly good! Thanks! – s12345 May 03 '22 at 08:11
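
One possible way to get per-user, per-period weights without refitting or saving the model is to keep the model trained on the full corpus and slice the rows of the document-topic matrix before computing the weights. The sketch below is only an illustration under stated assumptions: the toy `doc_topic_matrix`, the `users` and `periods` arrays, and the local `topic_weights` helper are all hypothetical stand-ins. The helper assumes that a topic's weight is its column sum divided by the total of all column sums, which is how `model.topic_weights` is understood to behave; this is worth checking against the installed textacy version.

```python
import numpy as np

# Toy stand-in for the document-topic matrix produced by the fitted model:
# one row per tweet, one column per topic (values are made up).
doc_topic_matrix = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.5, 0.1, 0.4],
    [0.6, 0.1, 0.3],
    [0.2, 0.2, 0.6],
])

# Hypothetical metadata, in the same order as the rows of the matrix.
users = np.array(["a", "a", "a", "a", "b", "b"])
periods = np.array(["p1", "p1", "p2", "p2", "p1", "p2"])


def topic_weights(matrix):
    """Share of each topic in the given rows (assumed definition, see lead-in)."""
    col_sums = matrix.sum(axis=0)
    return col_sums / col_sums.sum()


# Topic weights for user "a", computed separately for each period.
weights_by_period = {}
for period in ("p1", "p2"):
    mask = (users == "a") & (periods == period)  # boolean row selector
    weights_by_period[period] = topic_weights(doc_topic_matrix[mask])

for period, weights in weights_by_period.items():
    print(period, weights)
```

With the real model, the same boolean mask could presumably be applied as `model.topic_weights(doc_topic_matrix[mask])`, and the resulting array passed to the `ax1.bar` call from the question to draw one bar chart per period.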

0 Answers