I used Counter to keep only the 128 most common words in my list, preserving their original order:

from collections import Counter, OrderedDict

# my_list is the existing list of words
words = my_list
mcommon_words = [word for word, word_count in Counter(words).most_common(128)]
my_list = [x for x in my_list if x in mcommon_words]
my_list = OrderedDict.fromkeys(my_list)
my_list = list(my_list.keys())

Now I want to do the same for the 128 least common words. A faster solution would also help a lot.

Eduardo Andrade
    Possible duplicate of [Obtaining the least common element in array](https://stackoverflow.com/questions/4743035/obtaining-the-least-common-element-in-array) – Mr. T Mar 01 '18 at 20:45

2 Answers

most_common() returns the words and their counts as a list of tuples, ordered from most to least common. If no argument is given, it returns all the words.

The fact that the method returns a list means that you can use slicing to get the first and last n elements.

Demo:

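from collections import Counter

l = list("asadfabsdfasodfjoasdffsafdsa")
# most_common() with no argument returns every item, most frequent first
sorted_items = [w for w, _ in Counter(l).most_common()]

print(sorted_items[:2])   # the 2 most common items
print(sorted_items[-2:])  # the 2 least common items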
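
Applied to the question, the same slicing might look like the sketch below. The helper name keep_least_common and its parameters are made up for illustration. It assumes the goal is to keep the n least common words, de-duplicated, in the order they first appear. A set is used for the membership test, which avoids scanning a list for every word.

```python
from collections import Counter

def keep_least_common(words, n=128):
    """Return the n least common distinct words, in first-seen order.

    Hypothetical helper for illustration only. Which words survive a tie
    at the cut-off depends on the order most_common() returns them in.
    """
    ranked = [w for w, _ in Counter(words).most_common()]
    # ranked[-0:] would be the whole list, so guard against n <= 0
    rare = set(ranked[-n:]) if n > 0 else set()
    # dict.fromkeys de-duplicates while keeping insertion order (Python 3.7+)
    return list(dict.fromkeys(w for w in words if w in rare))
```

Passing n=128 corresponds to the case in the question; if there are fewer than n distinct words, all of them are kept.
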
nisemonoxide

You might try the following:

from collections import Counter

def common_words(words, number_of_words, reverse=False):
    counter = Counter(words)
    # Ascending by count (least common first); reverse=True puts the most common first
    return sorted(counter, key=counter.get, reverse=reverse)[:number_of_words]

Here the words in the Counter are sorted by their counts, in ascending order by default. After the sort, the first number_of_words entries are returned, which are the least common words. Here is a test example:

words = []
for i,num in enumerate('one two three four five six seven eight nine ten'.split()):
    words.extend([num]*(i+1))

print(common_words(words,5))

This example returns the 5 least common words from the test list.

Results:

['one', 'two', 'three', 'four', 'five']

We can also get the most common words:

print(common_words(words,5, reverse=True))

Results:

['ten', 'nine', 'eight', 'seven', 'six']
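
Since the question also asks about speed: when only a few words are needed, a full sort of every distinct word is not strictly required. One possible alternative, offered as a sketch rather than as part of the answer above, is heapq.nsmallest from the standard library, which selects the n smallest items by key. The function name least_common_words is hypothetical.

```python
import heapq
from collections import Counter

def least_common_words(words, n):
    """Return the n words with the smallest counts, rarest first."""
    counter = Counter(words)
    # nsmallest iterates over the Counter's keys and ranks them by count
    return heapq.nsmallest(n, counter, key=counter.get)

# Same test data as above: 'one' appears once, 'two' twice, and so on
words = []
for i, num in enumerate('one two three four five six seven eight nine ten'.split()):
    words.extend([num] * (i + 1))

print(least_common_words(words, 5))
# ['one', 'two', 'three', 'four', 'five']
```

Whether this is actually faster than sorted() depends on the data and on how n compares to the number of distinct words, so it is worth timing both on the real list.
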
Mike Peder