285

Given an unordered list of values like

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

How can I get the frequency of each value that appears in the list, like so?

# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4`, 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
SuperStormer
  • 4,997
  • 5
  • 25
  • 35
Bruce
  • 33,927
  • 76
  • 174
  • 262
  • Does this answer your question? [How do I count the occurrences of a list item?](https://stackoverflow.com/questions/2600191/how-do-i-count-the-occurrences-of-a-list-item) – Alireza75 Jul 28 '22 at 04:08
  • 1
    @Alireza How does it answer this question? This linked question is about counting a ***single***, specific item from a list. This question asks to get the count of all elements in a list – Tomerikoo Jul 28 '22 at 07:33
  • @Tomerikoo see the 'user52028778' answer and just use Counter.values() – Alireza75 Jul 28 '22 at 07:43

33 Answers

637

In Python 2.7 (or newer), you can use collections.Counter:

>>> import collections
>>> a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
>>> counter = collections.Counter(a)
>>> counter
Counter({1: 4, 2: 4, 5: 2, 3: 2, 4: 1})
>>> counter.values()
dict_values([2, 4, 4, 1, 2])
>>> counter.keys()
dict_keys([5, 1, 2, 4, 3])
>>> counter.most_common(3)
[(1, 4), (2, 4), (5, 2)]
>>> dict(counter)
{5: 2, 1: 4, 2: 4, 4: 1, 3: 2}
>>> # Get the counts in order matching the original specification,
>>> # by iterating over keys in sorted order
>>> [counter[x] for x in sorted(counter.keys())]
[4, 4, 2, 1, 2]

If you are using Python 2.6 or older, you can download an implementation here.

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • 1
    @unutbu: What if I have three lists, a,b,c for which a and b remain the same, but c changes? How to count the value of c for which a and c are same? – Srivatsan Jun 29 '14 at 12:31
  • @Srivatsan: I don't understand the situation. Please post a new question where you can elaborate. – unutbu Jun 29 '14 at 13:27
  • 2
    Is there a way to extract the dictionary {1:4, 2:4, 3:2, 5:2, 4:1} from the counter object ? – Pavan Mar 22 '15 at 00:38
  • 9
    @Pavan: `collections.Counter` is a subclass of `dict`. You can use it in the same way you would a normal dict. If you really want a dict, however, you could convert it to a dict using `dict(counter)`. – unutbu Mar 22 '15 at 00:46
  • Is there a way to count values if the list is a set of co-ordinates? Say a = [(0,0),(0,1),(0,2),(1,0),(0,1)...] I need to get the frequency in place, preferably in another list – user Sep 16 '18 at 07:45
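On the coordinate question in the last comment: a minimal sketch, assuming the coordinates are tuples (which are hashable), so `collections.Counter` can count them directly. The sample list `coords` and the helper name `count_pairs` are made up for illustration. Inner lists are not hashable, so the helper converts each item to a tuple first.

```python
from collections import Counter

def count_pairs(items):
    """Count occurrences of each pair; inner lists are converted to tuples."""
    return Counter(tuple(item) for item in items)

coords = [(0, 0), (0, 1), (0, 2), (1, 0), (0, 1)]
counts = count_pairs(coords)
print(counts[(0, 1)])   # 2

# Frequencies in order of first appearance (dicts keep insertion order in 3.7+)
print([counts[c] for c in dict.fromkeys(coords)])  # [1, 2, 1, 1]
```

The same helper accepts a list of two-element lists such as `[[1, 'x'], [1, 'y'], [1, 'x']]`, because each inner list is turned into a tuple before counting.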
172

If the list is sorted, you can use groupby from the itertools standard library (if it isn't, you can just sort it first, although this takes O(n lg n) time):

from itertools import groupby

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]

Output:

[4, 4, 2, 1, 2]
Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
Nadia Alramli
  • 111,714
  • 37
  • 173
  • 152
  • nice, using `groupby`. I wonder about its efficiency vs. the dict approach, though – Eli Bendersky Jan 29 '10 at 12:20
  • @Eli, yeah I'm not sure about its efficiency. But it doesn't hurt to have a variety of solutions. – Nadia Alramli Jan 29 '10 at 12:28
  • 36
    The python groupby creates new groups when the value it sees changes. In this case [1,1,1,2,1,1,1] would return [3,1,3]. If you expected [6,1] then just be sure to sort the data before using groupby. – Evan Jan 30 '10 at 22:41
  • I wonder if there's a way to skip the conversion to a list in `len(list(group))`. – Cristian Ciupitu Mar 22 '10 at 12:17
  • 4
    @CristianCiupitu: `sum(1 for _ in group)`. – Martijn Pieters Jul 29 '16 at 07:25
  • @Nadia Alramli Could you please specify the Python version this works or if there is a difference between Python versions someone should know. – buhtz Jul 29 '16 at 08:22
  • 8
    This is not a solution. The output doesn't tell what was counted. – buhtz Jul 29 '16 at 13:58
  • 12
    `[(key, len(list(group))) for key, group in groupby(a)]` or `{key: len(list(group)) for key, group in groupby(a)}` @buhtz – Eric Pauley Jun 16 '17 at 18:47
  • @EricPauley If this is an answer, please convert it to an answer. Your comment is unreadable. – buhtz Jun 17 '17 at 20:42
  • @buhtz you just add the key to the output. Or use the other answers. It's not hard to see that I posted two lines which can substitute for a line in the answer. – Eric Pauley Jun 17 '17 at 20:47
  • Can this method be generalized to count list pairs? eg `a = [[1,'x'],[1,'y'],[1,'x'],[2,'y']]`? – alancalvitti Jan 04 '19 at 14:53
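The comments above note that `groupby` starts a new group whenever the value changes, and that `sum(1 for _ in group)` avoids building a list. A small sketch of both points follows. The helper name `run_lengths` is hypothetical, and keys are returned alongside the counts so the output says what was counted.

```python
from itertools import groupby

def run_lengths(values):
    """Return (key, run length) pairs for consecutive equal values."""
    # sum(1 for _ in group) counts the run without building a list
    return [(key, sum(1 for _ in group)) for key, group in groupby(values)]

data = [1, 1, 1, 2, 1, 1, 1]
print(run_lengths(data))          # [(1, 3), (2, 1), (1, 3)]
print(run_lengths(sorted(data)))  # [(1, 6), (2, 1)]
```

Only the sorted call gives one entry per distinct value, which is why the answer sorts the list before grouping.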
117

Python 2.7+ introduces Dictionary Comprehension. Building the dictionary from the list will get you the count as well as get rid of duplicates.

>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> d = {x:a.count(x) for x in a}
>>> d
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
>>> a, b = list(d.keys()), list(d.values())
>>> a
[1, 2, 3, 4, 5]
>>> b
[4, 4, 2, 1, 2]
Amjith
  • 22,626
  • 14
  • 43
  • 38
  • This works really well with lists of strings as opposed to integers like the original question asked. – Glen Selle Jan 14 '16 at 05:29
  • 18
    It's faster using a set: `{x:a.count(x) for x in set(a)}` – stenci Feb 17 '16 at 17:55
  • 61
    **This is hugely inefficient**. `a.count()` does a *full traverse* for each element in `a`, making this a O(N^2) quadratic approach. `collections.Counter()` is **much more efficient** because it counts in linear time (O(N)). In numbers, that means this approach will execute 1 million steps for a list of length 1000, vs. just 1000 steps with `Counter()`, 10^12 steps where only 10^6 are needed by Counter for a million items in a list, etc. – Martijn Pieters Jul 29 '16 at 07:29
  • 7
    @stenci: sure, but the horror of using `a.count()` completely dwarfs the efficiency of having used a set there. – Martijn Pieters Oct 07 '16 at 23:22
  • 2
    @MartijnPieters one more reason to use it fewer times :) – stenci Oct 08 '16 at 01:14
  • Is it really so hard to do it efficiently? `d = {k: 0 for k in a}; for i in a: d[i]+=1;` – DylanYoung Jan 13 '20 at 20:56
  • 1
    @DylanYoung, that's what `collections.Counter` does but better. – Nzbuu Dec 04 '20 at 10:34
  • Yep; not sure your point. As for performance, that depends: https://stackoverflow.com/questions/27801945/surprising-results-with-python-timeit-counter-vs-defaultdict-vs-dict/27802189#27802189 – DylanYoung Dec 04 '20 at 19:44
  • 1
    In 3.6 and above, the keys will appear in `d` in order of first appearance in the list, so we can get this neatly ordered result by starting with sorted data (as shown). In previous versions, the key order is **not specified**. Also, sorting the list **does not help** with performance. Therefore, it makes more sense to start with the unordered input, and sort the keys/values afterwards if desired. However, this approach is still inefficient. – Karl Knechtel Jul 28 '22 at 03:55
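To see the cost difference discussed in these comments, here is a rough sketch that checks both approaches return the same counts and then times them with `timeit`. The function names and the sample data are illustrative assumptions; actual timings depend on the machine and on how many distinct values the list holds, so no expected numbers are shown.

```python
import timeit
from collections import Counter

def counts_quadratic(values):
    # list.count rescans the whole list once per distinct value
    return {x: values.count(x) for x in set(values)}

def counts_linear(values):
    # Counter makes a single pass over the list
    return dict(Counter(values))

sample = list(range(500)) * 4   # 2000 items, 500 distinct values
assert counts_quadratic(sample) == counts_linear(sample)

for fn in (counts_quadratic, counts_linear):
    seconds = timeit.timeit(lambda: fn(sample), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

With many distinct values the `count`-based version should fall behind quickly, since its work grows with the number of distinct values times the list length.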
51

Count the number of appearances manually by iterating through the list and counting them up, using a collections.defaultdict to track what has been seen so far:

from collections import defaultdict

appearances = defaultdict(int)

for curr in a:
    appearances[curr] += 1
Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
Idan K
  • 20,443
  • 10
  • 63
  • 83
  • 1
    +1 for collections.defaultdict. Also, in python 3.x, look up collections.Counter. It is the same as collections.defaultdict(int). – hughdbrown Jan 29 '10 at 13:54
  • 4
    @hughdbrown, actually `Counter` can use multiple numeric types including `float` or `Decimal`, not just `int`. – Cristian Ciupitu Jul 29 '16 at 08:43
  • `collections.Counter` does much more, and is a much more specialized tool, than `collections.defaultdict` with a numeric value type. It has extra convenience functions, and conceptually models the idea that the values represent counts rather than just being arbitrary numbers. – Karl Knechtel Jul 28 '22 at 04:02
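A short sketch of the differences raised in these comments, using the list from the question. It only illustrates behaviour of the standard `collections` module; the variable names `dd` and `c` are arbitrary.

```python
from collections import Counter, defaultdict

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

dd = defaultdict(int)
for x in a:
    dd[x] += 1

c = Counter(a)
print(dict(dd) == dict(c))      # True: same counts either way

# Looking up a missing key: defaultdict inserts it, Counter does not
_ = dd[99]
_ = c[99]
print(99 in dd, 99 in c)        # True False

# Counter-only conveniences
print(c.most_common(1))         # [(1, 4)]
print((c + Counter([4, 4]))[4]) # 3: counts are added key by key
```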
36

In Python 2.7+, you could use collections.Counter to count items

>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>>
>>> from collections import Counter
>>> c=Counter(a)
>>>
>>> list(c.values())
[4, 4, 2, 1, 2]
>>>
>>> list(c.keys())
[1, 2, 3, 4, 5]
YOU
  • 120,166
  • 34
  • 186
  • 219
29

Counting the frequency of elements is probably best done with a dictionary:

b = {}
for item in a:
    b[item] = b.get(item, 0) + 1

To remove the duplicates, use a set:

a = list(set(a))
Idan K
  • 20,443
  • 10
  • 63
  • 83
lindelof
  • 34,556
  • 31
  • 99
  • 140
  • 3
    @phkahler: Mine would only be a tiny bit better than this. It's hardly worth my posting a separate answer when this can be improved with a small change. The point of SO is to get to the *best* answers. I could simply edit this, but I prefer to allow the original author a chance to make their own improvements. – S.Lott Jan 29 '10 at 16:58
  • 1
    @S.Lott The code is much cleaner without having to import `defaultdict`. – Brian Jul 28 '18 at 23:04
  • Why not preinitialize b: `b = {k:0 for k in a}`? – DylanYoung Jan 13 '20 at 21:00
  • 1
    @DylanYoung, because then you have to scan the list twice. And there's unlikely to be any benefit in Python: but check this for yourself. – Nzbuu Dec 04 '20 at 10:37
  • The benefit is clean code :) Could use a `defaultdict` too of course, then you don't have to iterate through a – DylanYoung Dec 04 '20 at 19:42
21

You can do this:

import numpy as np
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
np.unique(a, return_counts=True)

Output:

(array([1, 2, 3, 4, 5]), array([4, 4, 2, 1, 2], dtype=int64))

The first array is values, and the second array is the number of elements with these values.

So if you want just the array of counts, use this:

np.unique(a, return_counts=True)[1]
Evgenii Pavlov
  • 211
  • 2
  • 4
20

Here's another succinct alternative using itertools.groupby which also works for unordered input:

from itertools import groupby

items = [5, 1, 1, 2, 2, 1, 1, 2, 2, 3, 4, 3, 5]

results = {value: len(list(freq)) for value, freq in groupby(sorted(items))}

results

# format: {value: num_of_occurrences}
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
rbento
  • 9,919
  • 3
  • 61
  • 61
8

I would simply use scipy.stats.itemfreq in the following manner:

from scipy.stats import itemfreq

a = [1,1,1,1,2,2,2,2,3,3,4,5,5]

freq = itemfreq(a)

a = freq[:,0]
b = freq[:,1]

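You may check the documentation here: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html. Note that itemfreq has since been deprecated and removed from SciPy; np.unique(a, return_counts=True) returns the same values and counts.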

8
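import numpy as np
import pandas as pd
from collections import Counter

a=["E","D","C","G","B","A","B","F","D","D","C","A","G","A","C","B","F","C","B"]

counter=Counter(a)

# first row: the distinct letters; second row: their counts
kk=[list(counter.keys()),list(counter.values())]

# transpose so that each row is a (letter, count) pair
pd.DataFrame(np.array(kk).T, columns=['Letter','Count'])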
sheldonzy
  • 5,505
  • 9
  • 48
  • 86
  • 1
    While this code snippet may be the solution, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion – Rahul Gupta Dec 28 '17 at 05:16
  • Yes will do that Rahul Gupta – Anirban Lahiri Dec 28 '17 at 20:00
6
seta = set(a)
b = [a.count(el) for el in seta]
a = list(seta) #Only if you really want it.
lprsd
  • 84,407
  • 47
  • 135
  • 168
  • 4
    using lists `count` is ridiculously expensive and uncalled for in this scenario. – Idan K Jan 29 '10 at 12:20
  • @IdanK why count is expensive? – Kritika Rajain Sep 07 '18 at 11:13
  • @KritikaRajain For each unique element in the list you iterate over the whole list to generate a count (quadratic in the number of unique elements in the list). Instead, you can iterate over the list once and count up the number of each unique element (linear in the size of the list). If your list has only one unique element, the result will be the same. Moreover, this approach requires an additional intermediate set. – DylanYoung Jan 13 '20 at 21:10
5

Suppose we have a list:

fruits = ['banana', 'banana', 'apple', 'banana']

We can find out how many of each fruit we have in the list like so:

import numpy as np    
(unique, counts) = np.unique(fruits, return_counts=True)
{x:y for x,y in zip(unique, counts)}

Result:

{'apple': 1, 'banana': 3}
Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
jobima
  • 5,790
  • 1
  • 20
  • 18
4

This answer is more explicit

a = [1,1,1,1,2,2,2,2,3,3,3,4,4]

d = {}
for item in a:
    if item in d:
        d[item] = d.get(item)+1
    else:
        d[item] = 1

for k,v in d.items():
    print(str(k)+':'+str(v))

# output
#1:4
#2:4
#3:3
#4:2

#remove dups
d = set(a)
print(d)
#{1, 2, 3, 4}
Corey Richey
  • 57
  • 1
  • 1
  • Good work, simple solution to implement occurrence count in dictionary. – Abdul Salam Apr 23 '21 at 14:28
  • There is no need to use `d.get(item)` after checking `if item in d:` – both will check the exact same thing. Either use `d[item] = d[item]+1` inside the `if`, or remove the `if` and use the single case of `d[item] = d.get(item, 0) + 1`. – MisterMiyagi Jul 30 '22 at 12:58
2

For your first question, iterate over the list and use a dictionary to keep track of each element's existence.

For your second question, just use the set operator.
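The answer above gives no code, so here is a minimal sketch of what it describes. It assumes the "second question" refers to removing duplicates, as several other answers in this thread do; the function name `frequencies` is made up for illustration.

```python
def frequencies(values):
    """Map each distinct value to the number of times it appears."""
    seen = {}
    for value in values:
        if value in seen:       # already encountered: bump its count
            seen[value] += 1
        else:                   # first time we see this value
            seen[value] = 1
    return seen

a = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5]
print(frequencies(a))           # {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}

# Removing duplicates with a set (sorted, since sets are unordered)
print(sorted(set(a)))           # [1, 2, 3, 4, 5]
```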

t3rse
  • 10,024
  • 11
  • 57
  • 84
2
def frequencyDistribution(data):
    return {i: data.count(i) for i in data}   

print(frequencyDistribution([1,2,3,4]))

...

 {1: 1, 2: 1, 3: 1, 4: 1}   # originalNumber: count
user2422819
  • 177
  • 13
2

I am quite late, but this will also work, and will help others:

a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq_list = []
a_l = list(set(a))

for x in a_l:
    freq_list.append(a.count(x))


print('Freq', freq_list)
print('number', a_l)

This will produce:

Freq [4, 4, 2, 1, 2]
number [1, 2, 3, 4, 5]
jax
  • 3,927
  • 7
  • 41
  • 70
2
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
d.b
  • 32,245
  • 6
  • 36
  • 77
1
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]

# 1. Get counts and store in another list
output = []
for i in set(a):
    output.append(a.count(i))
print(output)

# 2. Remove duplicates using set constructor
a = list(set(a))
print(a)
  1. A set does not allow duplicates, so passing a list to the set() constructor yields an iterable of unique objects. list.count() returns how many times a given object appears in the list. Each unique object is counted this way, and each count is appended to the initially empty list output.
  2. The list() constructor converts set(a) back into a list, which is bound to the same variable a.

Output

[4, 4, 2, 1, 2]
[1, 2, 3, 4, 5]
Sai Kiran
  • 61
  • 7
1

Simple solution using a dictionary.

def frequency(l):
    d = {}
    for i in l:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1

    # return the first value with the highest count, plus all distinct values
    highest = max(d.values())
    for k, v in d.items():
        if v == highest:
            return k, list(d)

print(frequency([10,10,10,10,20,20,20,20,40,40,50,50,30]))
oshaiken
  • 2,593
  • 1
  • 15
  • 25
0
#!/usr/bin/python
def frq(words):
    # count how often each word occurs
    freq = {}
    for w in words:
        freq[w] = freq.get(w, 0) + 1
    return freq

# read the whole file and split it into words
with open("poem", "r") as fp:
    words = fp.read().split()
print(words)
d = frq(words)
print("frequency of input:")
print(d)
# write one "word:count" line per distinct word
with open("output.txt", "w+") as fp1:
    for k, v in d.items():
        fp1.write(str(k) + ':' + str(v) + "\n")
amrutha
  • 9
  • 1
0
from collections import OrderedDict
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
def get_count(lists):
    dictionary = OrderedDict()
    for val in lists:
        dictionary.setdefault(val,[]).append(1)
    return [sum(val) for val in dictionary.values()]
print(get_count(a))
# [4, 4, 2, 1, 2]

To remove duplicates and maintain order:

list(dict.fromkeys(a))
# [1, 2, 3, 4, 5]
Veera Balla Deva
  • 790
  • 6
  • 19
0

I'm using Counter to generate a frequency dict from the words of a text file in a single expression:

import re
from collections import Counter

def _fileIndex(fh):
    '''Create a Counter (a dict subclass) from a flat list of the words
    found by re.findall(r"[a-zA-Z]+", lines) on each line of the file fh.
    '''
    return Counter(
        [wrd.lower() for wrdList in
         [re.findall(r'[a-zA-Z]+', lines) for lines in fh]
         for wrd in wrdList])
roberto
  • 577
  • 6
  • 5
0

For the record, a functional answer:

>>> L = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> import functools
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc,1)] if e<=len(acc) else acc+[0 for _ in range(e-len(acc)-1)]+[1], L, [])
[4, 4, 2, 1, 2]

It's cleaner if you count zeroes too:

>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc)] if e<len(acc) else acc+[0 for _ in range(e-len(acc))]+[1], L, [])
[0, 4, 4, 2, 1, 2]

An explanation:

  • we start with an empty acc list;
  • if the next element e of L is lower than the size of acc, we just update this element: v+(i==e) means v+1 if the index i of acc is the current element e, otherwise the previous value v;
  • if the next element e of L is greater than or equal to the size of acc, we have to expand acc to host the new 1.

The elements do not have to be sorted (unlike with itertools.groupby). You'll get weird results if you have negative numbers.
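As a variation on the same functional idea, here is a sketch that folds the list into a dict instead of a positional list. This is a swapped-in technique, not the answer's own code, and the helper name `count_step` is hypothetical. Because values are only used as keys, negative or non-integer elements cause no trouble.

```python
import functools

def count_step(acc, e):
    """Return a new dict with the count for e incremented by one."""
    return {**acc, e: acc.get(e, 0) + 1}

L = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5]
counts = functools.reduce(count_step, L, {})
print([counts[k] for k in sorted(counts)])   # [4, 4, 2, 1, 2]

# Negative and non-integer values are fine, since they are only used as keys
print(functools.reduce(count_step, [-1, "x", -1], {}))  # {-1: 2, 'x': 1}
```

Copying the dict at every step keeps the fold free of mutation but costs extra time on long lists; it is meant as an illustration rather than a fast implementation.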

jferard
  • 7,835
  • 2
  • 22
  • 35
0

Another approach, albeit one that relies on a heavier but powerful library, NLTK:

import nltk

fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()
Abhishek Poojary
  • 749
  • 9
  • 10
0

Found another way of doing this, using sets.

#ar is the list of elements
#convert ar to set to get unique elements
sock_set = set(ar)

#create dictionary of frequency of socks
sock_dict = {}

for sock in sock_set:
    sock_dict[sock] = ar.count(sock)
Abhishek Poojary
  • 749
  • 9
  • 10
0

For an unordered list you should use:

[a.count(el) for el in set(a)]

The output is

[4, 4, 2, 1, 2]
Luigi Tiburzi
  • 4,265
  • 7
  • 32
  • 43
  • Note that sets do not preserve order. As a result, the positions in the list and thus the meaning of the contained counts are completely arbitrary wrt the actual items. – MisterMiyagi Jul 30 '22 at 12:48
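Following the comment above, one way to keep each count tied to its item is to return pairs in a defined order. This is a sketch using `collections.Counter` on the list from the question, assuming Python 3.7+ for the insertion-order behaviour.

```python
from collections import Counter

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
counts = Counter(a)

# Pair every count with the item it belongs to, in sorted item order
print([(el, counts[el]) for el in sorted(counts)])
# [(1, 4), (2, 4), (3, 2), (4, 1), (5, 2)]

# Or in order of first appearance in the list
print(list(counts.items()))
# [(5, 2), (1, 4), (2, 4), (4, 1), (3, 2)]
```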
-1

Yet another solution with another algorithm without using collections:

def countFreq(A):
   # assumes A is a non-empty list of non-negative integers
   count=[0]*(max(A)+1)            # one slot per possible value, initialized with 0
   for x in A:
      count[x]+= 1                 # increase occurrence for value x
   return [x for x in count if x]  # return non-zero counts
Reza Abtin
  • 205
  • 3
  • 7
-1
num=[3,2,3,5,5,3,7,6,4,6,7,2]
print ('\nelements are:\t',num)
count_dict={}
for elements in num:
    count_dict[elements]=num.count(elements)
print ('\nfrequency:\t',count_dict)
-1

You can use the built-in count method of Python lists:

l.count(l[i])


d=[]
for i in range(len(l)):
    if l[i] not in d:
        d.append(l[i])
        print(l.count(l[i]))

The above code automatically removes duplicates in a list and also prints the frequency of each element in original list and the list without duplicates.

Two birds for one shot ! X D

-1

This approach can be tried if you don't want to use any library and keep it simple and short!

a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)

Output:

[4, 4, 2, 1, 2]
Namrata Tolani
  • 823
  • 9
  • 12
-2

One more way is to use a dictionary together with list.count. Below is a naive way to do it.

dicio = dict()
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
b = list()
c = list()

for i in a:
   if i in dicio: continue
   else:
      dicio[i] = a.count(i)
      b.append(a.count(i))
      c.append(i)

print (b)
print (c)
-2
a=[1,2,3,4,5,1,2,3]
b=[0,0,0,0,0,0,0]
for i in range(0,len(a)):
    b[a[i]]+=1
Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
-4
str1='the cat sat on the hat hat'
list1=str1.split()
list2=str1.split()

count=0
m=[]

for i in range(len(list1)):
    t=list1.pop(0)
    print(t)
    for j in range(len(list2)):
        if t==list2[j]:
            count=count+1
            print(count)
    m.append(count)
    print(m)
    count=0
#print(m)
Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
  • 12
    You replied to a *very* old question with 13 other answers with nothing more than a code dump. This is not likely to be useful without some sort of explanation as to why it is better than the 13 other answers. When giving an answer it is preferable to give [some explanation as to WHY your answer](http://stackoverflow.com/help/how-to-answer) is the one. – Stephen Rauch Feb 22 '17 at 02:22