Count of each unique element in a list

Question

Say I have a list of countries

l = ['India', 'China', 'China', 'Japan', 'USA', 'India', 'USA']

and then I have a list of unique countries

ul = ['India', 'China', 'Japan', 'USA']

I want the count of each unique country in the list, sorted by count in ascending order. The output should be as follows:

Japan 1
China 2
India 2
USA   2

Ajax1234 · Answer 1 · 2017-06-07T17:19:40.230

9

You can use Counter from collections:

from collections import Counter

l = ["India", "China", "China", "Japan", "USA", "India", "USA"]

new_vals = Counter(l).most_common()
new_vals = new_vals[::-1]  # most_common() is in descending order of count, so reversing gives ascending order

for a, b in new_vals:
    print(a, b)
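
Note that reversing `most_common()` leaves tied counts in reverse first-seen order (on Python 3.7+), not alphabetical order. A minimal sketch, assuming ties should be broken alphabetically to match the output in the question; the name `ordered` is only illustrative:

```python
from collections import Counter

l = ["India", "China", "China", "Japan", "USA", "India", "USA"]

# sort by count first, then by name, so equal counts come out alphabetically
ordered = sorted(Counter(l).items(), key=lambda pair: (pair[1], pair[0]))

for name, count in ordered:
    print(name, count)
```

Sorting on the `(count, name)` tuple keeps the result deterministic regardless of the order in which the countries first appear in the list.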

edited Jun 07 '17 at 17:19

answered Jun 07 '17 at 17:14

Ajax1234


Is the output of `Counter(l).items()` guaranteed to return a list sorted by count? I think you need to use `most_common()` – vroomfondel Jun 07 '17 at 17:18
Agreed. The `Counter` docs give this recipe for n least common: `c.most_common()[:-n-1:-1]` - which simplifies to the usual `[::-1]` if n is equal to the total number of items. – Peter DeGlopper Jun 07 '17 at 17:19
@Ajax1234 What if I run this on my data and get the error `TypeError: unhashable type: 'dict'`? – ComplexData Jun 07 '17 at 17:29
Did you use exactly the data that is hard-coded in this example, i.e. `l`? – Ajax1234 Jun 07 '17 at 17:30
That exception indicates that your list contains at least one dict - counters require that the elements you're counting be hashable, and dicts are not. – Peter DeGlopper Jun 07 '17 at 17:35
@PeterDeGlopper I guess you are correct. What is the solution to this? – ComplexData Jun 07 '17 at 19:16
Could you post your data with the dictionary that you are using? – Ajax1234 Jun 07 '17 at 19:22
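
The asker's data was never posted, so the following is only a sketch under an assumption: each element is a dict with a 'country' key. Both the `records` list and the key name are hypothetical. Counting a hashable value taken from each dict, instead of the dict itself, avoids the `TypeError`:

```python
from collections import Counter

# hypothetical data: a list of dicts, which are not hashable themselves
records = [{"country": "India"}, {"country": "China"}, {"country": "China"}]

# count a hashable field from each dict instead of the dict itself
counts = Counter(record["country"] for record in records)

# alternative if whole dicts must be counted: freeze each one into a hashable key
# (this works only when all of the dict's values are themselves hashable)
whole_counts = Counter(frozenset(record.items()) for record in records)

print(counts)
```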

score 3 · Answer 2 · answered Jun 07 '17 at 18:03

If you don't want to use a Counter you can count yourself (you already know the unique elements because you have ul) using a dictionary:

l = ['India', 'China', 'China', 'Japan', 'USA', 'India', 'USA'] 
ul = ['India', 'China', 'Japan', 'USA']

cnts = dict.fromkeys(ul, 0)  # initialize with 0

# count them
for item in l:
    cnts[item] += 1

# print them in ascending order of count; ties are broken alphabetically by name
for name, cnt in sorted(cnts.items(), key=lambda x: (x[1], x[0])):
    # left-align the name in a 5-character field so the counts line up
    print('{:<5}'.format(name), cnt)
    # or, if the alignment does not matter:
    # print(name, cnt)

which prints:

Japan 1
China 2
India 2
USA   2

score 1 · Accepted Answer · answered Jun 07 '17 at 17:29

If you only want counts for the items in the ul list, you can use a list comprehension:

l = ['India', 'China', 'China', 'Japan', 'USA', 'India', 'USA']
ul = ['India', 'China', 'Japan', 'USA']
result = sorted([(x, l.count(x)) for x in ul], key=lambda y: y[1])
for elem in result:
    print('{} {}'.format(elem[0], elem[1]))

output:

Japan 1
India 2
China 2
USA 2

And if you want items with equal counts to appear in alphabetical order, you can change result to the following:

result = sorted(sorted([(x, l.count(x)) for x in ul]), key=lambda y: y[1])

output:

Japan 1
China 2
India 2
USA 2

answered Jun 07 '17 at 17:29

Mohd


The list comprehension using `count` is significantly slower than the `Counter` approach - order n^2 compared to order n just to count. See this answer for profiling: https://stackoverflow.com/a/23909767/2337736 – Peter DeGlopper Jun 07 '17 at 17:30
This approach is for the case where they have a predefined list of items (as in the example) to search for, rather than all of the items – Mohd Jun 07 '17 at 17:35
Even then I think you'd want to profile something like `ul_set = frozenset(ul); counts = Counter(country for country in countries if country in ul_set)` - you really want to avoid running `count` multiple times on the same list. I mean, it's fine for short lists, but you might as well use the faster tools. – Peter DeGlopper Jun 07 '17 at 17:44
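
A runnable sketch of the approach suggested in the comment above, using the `l` list from the question in place of `countries`. The `wanted` list is a hypothetical subset, chosen here only to show the filtering:

```python
from collections import Counter

l = ['India', 'China', 'China', 'Japan', 'USA', 'India', 'USA']
wanted = ['India', 'Japan']  # hypothetical subset of interest

wanted_set = frozenset(wanted)  # constant-time membership tests

# a single pass over l; items outside wanted_set are skipped
counts = Counter(country for country in l if country in wanted_set)

# ascending by count, ties broken alphabetically
for name, count in sorted(counts.items(), key=lambda pair: (pair[1], pair[0])):
    print(name, count)
```

One caveat: an item in `wanted` that never occurs in `l` will not appear in `counts.items()` at all, although looking it up directly (for example `counts['Brazil']`) returns 0.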
