
My original data

I want to convert the text data into a DataFrame with one column for each of the 500 words, as in the picture below. Each row should correspond to one sentence, and each cell should hold the number of times that word occurs in that sentence.

Final Output_data

I have already done the text preprocessing with NLTK.

James Z
kounteyo
  • _Kindly help._ With what, what specifically is the issue? Please provide a [mcve], and see [ask], [help/on-topic]. – AMC Aug 24 '20 at 00:20
  • I need to create a term-document matrix of the DataFrame, like the one shown above. – kounteyo Aug 24 '20 at 13:58

1 Answer

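from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
# Assumption: twenty_train is the 20 newsgroups training set used in the linked tutorial
twenty_train = fetch_20newsgroups(subset="train")
count_vect = CountVectorizer()  # tokenizes the text and counts word occurrences
X_train_counts = count_vect.fit_transform(twenty_train.data)  # sparse document-term matrix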

https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html

barker
  • This will only produce a vector matrix, which is not what I want. I want the occurrence count of each word in each sentence, in the DataFrame. – kounteyo Aug 24 '20 at 05:14