I'm trying to get the frequency distribution of a list, keeping only the items whose count is over a certain number.

Example:

import nltk
test_list = ['aa', 'aa', 'bb', 'cc', 'dd', 'dd']
test_fd = nltk.FreqDist(test_list)

Returns:

FreqDist({'aa': 2, 'dd': 2, 'bb': 1, 'cc': 1})

Without writing an explicit loop, I want all the items whose count is greater than 1.

Using Python 3.8 and NLTK 3.5

TheSavageTeddy
  • You would need a loop: even a solution that technically 'doesn't use a loop' would still loop internally. Use something like [this](https://stackoverflow.com/a/40555781/8661686). – agupta Oct 29 '20 at 14:09
  • Does this answer your question? [Finding frequency distribution of a list of numbers in python](https://stackoverflow.com/questions/40553332/finding-frequency-distribution-of-a-list-of-numbers-in-python) – agupta Oct 29 '20 at 14:11

2 Answers

Here is a possible solution:

test_fd = nltk.FreqDist({k: v for k, v in test_fd.items() if v > 1})
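
A minimal sketch of the same filter wrapped in a reusable function. The name `items_above` and its `threshold` parameter are hypothetical, and `collections.Counter` stands in for `nltk.FreqDist` (which subclasses `Counter`) so that the snippet runs without NLTK installed:

```python
from collections import Counter


def items_above(counts, threshold):
    """Return a new counter of the same type with only entries whose count > threshold."""
    # The comprehension still iterates internally; it only avoids an explicit for statement.
    return type(counts)({k: v for k, v in counts.items() if v > threshold})


test_list = ['aa', 'aa', 'bb', 'cc', 'dd', 'dd']
test_fd = Counter(test_list)  # stand-in for nltk.FreqDist(test_list)
filtered = items_above(test_fd, 1)
print(filtered)
```

Because the function rebuilds the result with `type(counts)`, passing an `nltk.FreqDist` should give back a `FreqDist` rather than a plain dict.
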
Riccardo Bucco

This can be done with filter, and you can choose whether the output is a dict or a list of tuples:

test_fd = dict(filter(lambda x: x[1] > 1, nltk.FreqDist(test_list).items()))
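
A sketch of the two output shapes mentioned above, again assuming `collections.Counter` as a stand-in for `nltk.FreqDist`. The names `more_than_once`, `as_dict`, and `as_list` are illustrative only:

```python
from collections import Counter

test_list = ['aa', 'aa', 'bb', 'cc', 'dd', 'dd']
counts = Counter(test_list)  # stand-in for nltk.FreqDist(test_list)


def more_than_once(item):
    # item is a (key, count) tuple produced by counts.items()
    return item[1] > 1


as_dict = dict(filter(more_than_once, counts.items()))
as_list = list(filter(more_than_once, counts.items()))

print(as_dict)  # {'aa': 2, 'dd': 2}
print(as_list)  # [('aa', 2), ('dd', 2)]
```

Note that both results are plain Python containers, so counter methods such as `most_common` are no longer available on them.
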
Luca Massaron