
I am trying to get the most common words from a dictionary I created. I read about it and saw that the usual answer is to use the sorted function with a [:n] slice to take the first n elements.

My problem is that my data is a bit different from what I saw. It is a list of dictionaries, like this:

[{"count":27,"stem":"obama","term":"obama"},{"count":20,"stem":"boehner","term":"boehner"},{"count":4,"stem":"tax","term":"tax"},
{"count":3,"stem":"daley","term":"daley"},{"count":3,"stem":"couldn","term":"couldn"},{"count":2,"stem":"trillion","term":"trillion"}]

So in this example obama is mentioned 27 times, boehner 20 times and tax 4 times. Say I want the top 5 most common words: how do I do that?

user3488862

1 Answer

In [39]: L = [{"count":27,"stem":"obama","term":"obama"},{"count":20,"stem":"boehner","term":"boehner"},{"count":4,"stem":"tax","term":"tax"},
{"count":3,"stem":"daley","term":"daley"},{"count":3,"stem":"couldn","term":"couldn"},{"count":2,"stem":"trillion","term":"trillion"}]

In [40]: counts = collections.Counter(itertools.chain.from_iterable([d['term']]*d['count'] for d in L))

In [41]: counts.most_common(5)
Out[41]: [('obama', 27), ('boehner', 20), ('tax', 4), ('daley', 3), ('couldn', 3)]

Don't forget to import itertools and collections first.
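Since the question mentions sorted and [:n], here is a minimal sketch of that route as well, assuming every record has an integer "count" key and a "term" key as in the sample data. The helper name top_terms is made up for illustration. It sorts the records directly instead of repeating each term count times, and it also shows a Counter filled from the stored counts.

```python
from collections import Counter
from operator import itemgetter

L = [
    {"count": 27, "stem": "obama", "term": "obama"},
    {"count": 20, "stem": "boehner", "term": "boehner"},
    {"count": 4, "stem": "tax", "term": "tax"},
    {"count": 3, "stem": "daley", "term": "daley"},
    {"count": 3, "stem": "couldn", "term": "couldn"},
    {"count": 2, "stem": "trillion", "term": "trillion"},
]


def top_terms(records, n):
    """Return the n records with the highest 'count', highest first.

    Hypothetical helper; sorted() is stable, so records with equal
    counts keep their original relative order.
    """
    return sorted(records, key=itemgetter("count"), reverse=True)[:n]


# Reduce each of the top records to a (term, count) pair.
top5 = [(d["term"], d["count"]) for d in top_terms(L, 5)]
print(top5)

# Alternative: fill a Counter from the stored counts, summing the
# counts if the same term shows up in more than one record.
counts = Counter()
for d in L:
    counts[d["term"]] += d["count"]
print(counts.most_common(5))
```

For a very long list where only a few top entries are needed, heapq.nlargest(n, L, key=itemgetter("count")) avoids sorting everything.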

inspectorG4dget