How do I plot the 50 least frequent words?

Maybe I'm overcomplicating this. Here's how I get the words:

distr = nltk.FreqDist(word for word in items)
words = distr.keys()
seldomwords = words[:50]

How do I plot this now?

With FreqDist's plot function I can only plot either all words or the x most frequent ones.

I tried something like:

distr.plot(:50)

But that's syntactically incorrect.

alvas
  • Take a look at http://stackoverflow.com/questions/37427673/sorting-freqdist-in-nltk-with-get-vs-get/37429443#37429443 – alvas May 26 '16 at 16:49

1 Answer

It's a bit roundabout, but the simplest way is to:

  • extract the least common items from the FreqDist,
  • build a new FreqDist object from those items,
  • call FreqDist.plot() on the new FreqDist.

[Code]:

>>> from nltk import FreqDist
>>> fd = FreqDist(list('aaabbbbbcccccdddddddd'))
>>> last_two = FreqDist(dict(fd.most_common()[-2:]))
>>> last_two.plot()

[out]:

(image: the resulting frequency plot)

[Code]:

>>> from nltk import FreqDist
>>> fd = FreqDist(list('aaabbbbbcccccdddddddd'))
>>> last_two = FreqDist(dict(fd.most_common()[-2:]))
>>> last_two.plot()
>>> last_three = FreqDist(dict(fd.most_common()[-3:]))
>>> last_three.plot()

[out]:

(image: the resulting frequency plot)
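
If you need this for an arbitrary n (e.g. the 50 least frequent words from the question), the slicing can be wrapped in a small helper. This is only a sketch: `least_common` is a made-up name, not part of NLTK, and the demo uses `collections.Counter` so it runs without NLTK installed. In NLTK 3, `FreqDist` subclasses `Counter`, so the same helper should work on a `FreqDist` unchanged.

```python
from collections import Counter


def least_common(dist, n):
    """Return a new distribution (same type as `dist`) with the n rarest items.

    Hypothetical helper, not an NLTK function.
    """
    if n <= 0:
        # [-0:] would return the whole list, so handle this case explicitly.
        return type(dist)()
    # most_common() sorts by descending count, so the tail holds the rarest items.
    return type(dist)(dict(dist.most_common()[-n:]))


counts = Counter("aaabbbbbcccccdddddddd")
rarest_one = least_common(counts, 1)
rarest_three = least_common(counts, 3)
```

With NLTK, you would pass the `FreqDist` instead of the `Counter` and call `.plot()` on the result, e.g. `least_common(distr, 50).plot()`. How ties at the cut-off are broken depends on the order `most_common()` returns them in.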

alvas