Following my previous question, I wrote code that creates a document-term matrix (DTM). I then need to do some calculations across the columns and rows of the DTM. However, Python hangs when computing the last line, and it is impossible to run the code (the whole PC freezes). How can I make the process run more smoothly?
Here is the code I am running (of course, the real texts list is far larger):
texts=['text1', 'text4', 'text2', 'text3']
(each text has already been stemmed and had its punctuation removed)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import CountVectorizer
import itertools
merged = list(itertools.chain.from_iterable(texts))
vect = CountVectorizer(min_df=0., max_df=1.0)
X = vect.fit_transform(texts)
df = pd.DataFrame(X.toarray().transpose(), index = vect.get_feature_names())