Here are two examples: one that works and is derived from https://www.nltk.org/book/ch02.html, and another that does not. The first example plots single-word frequencies, here for ['america', 'citizen']. The second is a modified version (evidently incorrect) that attempts to plot the frequency of the bigram ['american citizen']. I would like to plot n-gram frequencies, such as for a bigram like ['american citizen'].
[Image: Plot Example 1] [Image: Plot Example 2 - failed]
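import nltk
from nltk.book import *  # not used below; loads the book's example texts
import matplotlib.pyplot as plt
from nltk.corpus import inaugural

plt.ion()  # turn interactive mode on

# File IDs look like '1789-Washington.txt'; the first four characters are the year
inaugural.fileids()
[fileid[:4] for fileid in inaugural.fileids()]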
############ this works ############
cfd = nltk.ConditionalFreqDist(
    (target, fileid[:4])
    for fileid in inaugural.fileids()
    for w in inaugural.words(fileid)
    for target in ['america', 'citizen']
    if w.lower().startswith(target))
ax = plt.axes()
cfd.plot()
############ this does not work ############
cfd = nltk.ConditionalFreqDist(
    (target, fileid[:4])
    for fileid in inaugural.fileids()
    for w in inaugural.words(fileid)
    for target in ['american citizen']
    if w.lower().startswith(target))
ax = plt.axes()
cfd.plot()
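
My understanding of why the second version finds no matches is that inaugural.words(fileid) yields one token at a time, so no single token can start with the two-word string 'american citizen'. Below is a minimal sketch of one possible workaround, not a confirmed fix: it pairs each word with its successor and applies the same startswith() test to both halves of the bigram. The helper names bigram_year_pairs and plot_inaugural_bigrams and the toy file IDs are my own placeholders, not part of NLTK, and the plotting function assumes nltk, matplotlib, and the inaugural corpus are installed.

```python
from collections import Counter


def bigram_year_pairs(docs, targets):
    """Yield (target, year) for every adjacent word pair matching a target.

    docs: iterable of (fileid, words) pairs.
    targets: two-word strings such as 'american citizen' (assumed to be bigrams).
    """
    split_targets = [(t, t.split()) for t in targets]
    for fileid, words in docs:
        lowered = [w.lower() for w in words]
        # zip(lowered, lowered[1:]) walks over consecutive word pairs
        for first, second in zip(lowered, lowered[1:]):
            for target, (t1, t2) in split_targets:
                if first.startswith(t1) and second.startswith(t2):
                    yield target, fileid[:4]


def plot_inaugural_bigrams(targets):
    """Hypothetical wrapper: feed the pairs into ConditionalFreqDist and plot."""
    import nltk
    from nltk.corpus import inaugural

    docs = ((fileid, inaugural.words(fileid)) for fileid in inaugural.fileids())
    cfd = nltk.ConditionalFreqDist(bigram_year_pairs(docs, targets))
    cfd.plot()


# Toy data (made up for illustration) to check the matching logic without the corpus
toy_docs = [
    ("1789-Washington.txt", ["Fellow", "American", "citizens", "of", "America"]),
    ("1793-Washington.txt", ["American", "citizen", "and", "American", "Citizens"]),
]
counts = Counter(bigram_year_pairs(toy_docs, ["american citizen"]))
```

With the real corpus, the call would be plot_inaugural_bigrams(['american citizen']). Because the test is still startswith(), 'american citizens' is counted as well, which mirrors the behaviour of the working single-word example.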