47

I have a Python list and I want to know the quickest way to count the number of occurrences of the item '1' in this list. In my actual case the item can occur tens of thousands of times, which is why I want a fast way.

['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

Which approach: .count or collections.Counter is likely more optimized?

prrao

5 Answers

81
a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
print(a.count("1"))

It's probably optimized heavily at the C level.

Edit: I randomly generated a large list.

In [8]: len(a)
Out[8]: 6339347

In [9]: %timeit a.count("1")
10 loops, best of 3: 86.4 ms per loop

Edit edit: This can also be done with collections.Counter:

from collections import Counter

a = Counter(your_list)
print(a['1'])

Using the same list as in my last timing example:

In [17]: %timeit Counter(a)['1']
1 loops, best of 3: 1.52 s per loop

My timing is simplistic and conditional on many different factors, but it gives you a good clue as to performance.
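The comparison above can be reproduced with the `timeit` module. This is a sketch, not the original session: the list size, value range, and loop count here are assumptions chosen to keep the run short, so absolute numbers will differ from the IPython figures above.

```python
import random
import timeit
from collections import Counter

# Assumed test data: a random list of digit strings, much smaller than
# the ~6.3M-element list used in the answer above.
a = [str(random.randint(1, 10)) for _ in range(100_000)]

# Time a single-value query with each approach.
t_count = timeit.timeit(lambda: a.count("1"), number=100)
t_counter = timeit.timeit(lambda: Counter(a)["1"], number=100)

print(f"list.count: {t_count:.3f}s  Counter: {t_counter:.3f}s")
```

Both queries return the same count; what differs is that `Counter` pays to tally every element before answering.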

Here is some profiling

In [24]: profile.run("a.count('1')")
         3 function calls in 0.091 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.091    0.091 <string>:1(<module>)
        1    0.091    0.091    0.091    0.091 {method 'count' of 'list' objects}

        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



In [25]: profile.run("b = Counter(a); b['1']")
         6339356 function calls in 2.143 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.143    2.143 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 _weakrefset.py:68(__contains__)
        1    0.000    0.000    0.000    0.000 abc.py:128(__instancecheck__)
        1    0.000    0.000    2.143    2.143 collections.py:407(__init__)
        1    1.788    1.788    2.143    2.143 collections.py:470(update)
        1    0.000    0.000    0.000    0.000 {getattr}
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  6339347    0.356    0.000    0.356    0.000 {method 'get' of 'dict' objects}
Jakob Bowyer
  • Which approach do you think is better optimized? I guess the better option is case dependent? – prrao Sep 17 '12 at 03:09
  • 18
    @prrao. In this case `count` is ~20x faster than creating a `Counter`, but the same `Counter` can be used to retrieve counts of multiple different values at very low extra cost. If you need to count 20 or more values from the same list, `Counter` will be more efficient than running `.count()` 20 times – John La Rooy Sep 17 '12 at 05:16
  • 7
    I was working with a data set of 1,000,000 integers where the range of the set was 100, i.e. each element was repeated around 10,000 times. Using `Counter` instead of `.count` brought down my time by half. +1 for `Counter`. – shshnk Sep 26 '15 at 17:23
  • 2
    And I was working with a list of 350,000 strings (urls): using Counter took less than a second while I had time to drink a smoothie waiting for .count() to be done, so +1 again for Counter :) (Indeed I was counting every distinct url so, as said before, it's better to use Counter in this case). – pawamoy May 19 '16 at 11:38
  • I must be missing something. Working with _list[long]_ datasets (containing `random.randint(0, sys.maxsize)` numbers, up to 50M of them), trying to count another `randint` with the same parameters, `.count` is ~10 times faster than `Counter` (when only counting once). I also switched to _generators_, which `Counter` knows how to handle, but the combined times (generating the list/generator + counting) still favor `list` and `.count`. The behavior is consistent across _Python3_ and _Python2_. – CristiFati Aug 21 '17 at 22:30
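The trade-off John La Rooy describes in the comments (one `Counter` pass amortizes across many lookups) can be sketched as follows, using the question's own list:

```python
from collections import Counter

a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

# Repeated .count() calls rescan the entire list once per queried value...
counts_slow = {x: a.count(x) for x in set(a)}

# ...while a single Counter pass answers every later lookup in O(1).
c = Counter(a)
counts_fast = {x: c[x] for x in set(a)}

assert counts_slow == counts_fast  # same result, very different cost profile
print(counts_fast)
```

For one value, `.count()` wins; once you query many distinct values from the same list, building the `Counter` once is cheaper overall.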
21

Using a Counter dictionary, you can count the occurrences of every element, and find the most common elements, in a single pass over the list.

If our Python list is:

l = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

To find the occurrences of every item in the list:

>>> from collections import Counter
>>> c = Counter(l)
>>> c
Counter({'1': 6, '2': 4, '7': 3, '10': 2})

To list the items by frequency, most common first:

>>> k = c.most_common()
>>> k
[('1', 6), ('2', 4), ('7', 3), ('10', 2)]

For the highest occurrence count:

>>> k[0][1]
6

and for the item itself, use k[0][0]:

>>> k[0][0]
'1'

For the nth most common item and its number of occurrences, use k[n-1]. For n = 2:

>>> n = 2
>>> k[n-1][0]  # the item
'2'
>>> k[n-1][1]  # its count
4
Community
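A small addition to the `most_common` walkthrough above: the method also accepts an optional `n` argument, so slicing `k` by hand isn't required when you only want the top few items.

```python
from collections import Counter

l = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
c = Counter(l)

# most_common(n) returns only the n highest-frequency (item, count) pairs.
top_two = c.most_common(2)
print(top_two)  # [('1', 6), ('2', 4)]
```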
3

You can use pandas: transform the list to a pd.Series, then simply call .value_counts():

import pandas as pd
a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
a_cnts = pd.Series(a).value_counts().to_dict()

>>> a_cnts["1"], a_cnts["10"]
(6, 2)
J. Doe
1

A combination of lambda and the map function can also do the job:

>>> list_ = ['a', 'b', 'b', 'c']
>>> sum(map(lambda x: x == "b", list_))
2
Mahdi Ghelichi
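The map/lambda approach above is usually written as a plain generator expression, which drops the lambda entirely. This is a sketch of the equivalent idiom, not part of the original answer:

```python
list_ = ['a', 'b', 'b', 'c']

# sum() treats True as 1 and False as 0, so this counts the matches directly.
print(sum(x == "b" for x in list_))  # 2
```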
-1

You can convert the list to a string with elements separated by spaces, then split that string on the number/character to be searched.

This can be fast for a large list.

>>> L = [2, 1, 1, 2, 1, 3]
>>> strL = " ".join(str(x) for x in L)
>>> strL
'2 1 1 2 1 3'
>>> count = len(strL.split(" 1")) - 1
>>> count
3
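A caution about the string-splitting trick above: it miscounts whenever one element is a prefix of another, or when the target value sits at the front of the list (where it has no leading space). The lists below are made-up examples chosen to trigger each failure:

```python
# Prefix collision: "10" begins with "1", so " 1" matches inside " 10".
L1 = [2, 10, 3]
s1 = " ".join(map(str, L1))
print(len(s1.split(" 1")) - 1)   # 1, yet 1 never occurs in L1

# Missing leading space: a 1 at the front of the list is not counted.
L2 = [1, 2, 3]
s2 = " ".join(map(str, L2))
print(len(s2.split(" 1")) - 1)   # 0, yet 1 occurs once in L2

# L1.count(1) and L2.count(1) give the correct answers (0 and 1).
```

`list.count()` (or `Counter`) avoids both edge cases, which is why the accepted answer is the safer choice.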