Hi :) I am really new to Python and NLP and am working through the NLTK book from O'Reilly. I'm currently stuck on the exercise about plotting and tabulating with a Conditional Frequency Distribution. The task is the following: "find out which days of the week are most newsworthy, and which are most romantic. Define a variable called days containing a list of days of the week, i.e. ['Monday', ...]. Now tabulate the counts for these words using cfd.tabulate(samples=days). Now try the same thing using plot in place of tabulate. You may control the output order of days with the help of an extra parameter: samples=['Monday', ...]."
This is my code:
import nltk
from nltk.corpus import brown
days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
genre_day = [(genre, day)
             for genre in ['news', 'romance']
             for day in days]
cfd = nltk.ConditionalFreqDist(genre_day)
tabulated = cfd.tabulate(conditions=['news', 'romance'],
                         sample=days, cumulative=True)
What I have as an outcome is this:
Could someone please explain why I get this output instead of a count of how often each day name is used per genre in the corpus? I will be very grateful for any help.
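For what it's worth, here is a minimal sketch of what the exercise seems to be asking for. The list comprehension in the code above pairs each genre with each day exactly once and never reads the corpus, so every count is 1 (and cumulative=True turns each row into running totals). The sketch instead walks over the words of each genre and keeps only the day names. The names genre_day_pairs, tabulate_brown_days and the toy word lists are made up for illustration; the Brown part assumes the corpus has already been downloaded.

```python
def genre_day_pairs(words_by_genre, days):
    """Yield one (genre, word) pair per occurrence of a day name in each genre."""
    wanted = set(days)
    for genre, words in words_by_genre.items():
        for word in words:
            if word in wanted:
                yield (genre, word)


def tabulate_brown_days(days, genres=("news", "romance")):
    """Hypothetical helper: count day names per genre in the Brown corpus."""
    # Imports are local so the generator above can be tried without NLTK.
    import nltk
    from nltk.corpus import brown  # assumes nltk.download("brown") was run once

    words_by_genre = {g: brown.words(categories=g) for g in genres}
    cfd = nltk.ConditionalFreqDist(genre_day_pairs(words_by_genre, days))
    # The keyword is `samples` (plural); tabulate() prints the table and
    # returns None, so there is nothing useful to assign from it.
    cfd.tabulate(conditions=list(genres), samples=days)
    return cfd


# Toy data (invented, not taken from Brown) to show the counting behaviour.
toy = {
    "news": ["On", "Monday", "the", "board", "met", "Monday", "Friday"],
    "romance": ["Sunday", "she", "left", "Sunday"],
}
toy_pairs = list(genre_day_pairs(toy, ["Sunday", "Monday", "Friday"]))
```

With the corpus installed, calling tabulate_brown_days(days) with the days list from the question should print one row per genre, and swapping cfd.tabulate for cfd.plot with the same arguments should give the plot version.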