Exclude zeros in collections.Counter in Python

Question

Is there a way to make collections.Counter ignore a given value (here 0) when counting?

from collections import Counter
import numpy as np

idx = np.random.randint(4, size=(100,100))
most_common = np.zeros(100)
num_most_common = np.zeros(100)

for i in range(100):
    most_common[i], num_most_common[i] = Counter(idx[i, :]).most_common(1)[0]

So if 0 is the most common value in a row, the result should be the second most common value instead. In addition, is there a way to avoid the for loop in this case?

just take the top 2 most common, and then the 2nd one if the first one is `0` — Chris_Rands, Nov 27 '19 at 12:57
@Divakar Yes, I will mark it as solved in a moment. In addition: Is it possible to generalize this for higher dimensions? — clearseplex, Nov 28 '19 at 07:34
@clearseplex At least for my solution, you can use `idx.reshape(-1,idx.shape[-1])` as the input in place of `idx`. Solution stays the same. — Divakar, Nov 28 '19 at 07:49
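The reshape suggestion in the last comment can be sketched as follows. This is only an illustration under stated assumptions: the 3D array `idx3d` is made-up sample data, the values are non-negative integers, and the helper is copied from the accepted answer below so the snippet runs on its own.

```python
import numpy as np

def bincount2D_vectorized(a):
    # Row-wise bincount for a 2D array of non-negative ints (helper from the accepted answer).
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)

# Hypothetical 3D input: a 2 x 3 grid of rows, each row holding 5 values in 0..3.
idx3d = np.array([
    [[0, 0, 1, 1, 1], [2, 0, 0, 0, 2], [3, 3, 0, 1, 0]],
    [[0, 1, 2, 2, 0], [0, 0, 0, 0, 3], [1, 1, 1, 1, 1]],
])

# Collapse all leading axes so that every row is one slice along the last axis.
flat = idx3d.reshape(-1, idx3d.shape[-1])
c = bincount2D_vectorized(flat)

# Ignore column 0 (the counts of zeros), then restore the leading shape.
most_common = (c[:, 1:].argmax(1) + 1).reshape(idx3d.shape[:-1])
num_most_common = c[:, 1:].max(1).reshape(idx3d.shape[:-1])
```

Reshaping the results back with `idx3d.shape[:-1]` is an addition for convenience; the comment itself only describes flattening the input.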

Divakar · Accepted Answer · 2019-11-27T13:22:41.710

For non-negative integers, we can use a vectorized 2D bincount, bincount2D_vectorized:

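# https://stackoverflow.com/a/46256361/ @Divakar
def bincount2D_vectorized(a):
    # One more bin than the largest value, so bins 0..a.max() all exist
    N = a.max()+1
    # Shift row i by i*N so that each row gets its own block of N bins
    a_offs = a + np.arange(a.shape[0])[:,None]*N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0]*N).reshape(-1,N)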

# Get binned counts per row, with each number representing a bin 
c = bincount2D_vectorized(idx)

# Skip the first element, as that represents counts for 0s.
# Get most common element and count per row
most_common = c[:,1:].argmax(1)+1
num_most_common = c[:,1:].max(1)
# faster : num_most_common = c[np.arange(len(most_common)),most_common]
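To make the shape of `c` concrete, here is a small worked sketch on hand-picked sample data (the array below is an assumption for illustration, not the question's random input). It also shows two behaviours that are easy to miss: on a tie, `argmax` keeps the smallest value, and a row without any non-zero value reports value 1 with a count of 0.

```python
import numpy as np

def bincount2D_vectorized(a):
    # Row-wise bincount for a 2D array of non-negative ints.
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)

idx = np.array([
    [0, 0, 0, 2, 2, 1],   # 0 is most frequent overall, 2 wins among non-zeros
    [3, 1, 3, 1, 0, 0],   # tie between 1 and 3: argmax keeps the smaller value
    [0, 0, 0, 0, 0, 0],   # no non-zero values at all
])

c = bincount2D_vectorized(idx)
# c[i, v] is how often value v occurs in row i:
# [[3 1 2 0]
#  [2 2 0 2]
#  [6 0 0 0]]

most_common = c[:, 1:].argmax(1) + 1
num_most_common = c[:, 1:].max(1)

# Rows without any non-zero value end up with a count of 0; mask them if needed.
valid = num_most_common > 0
```

The `valid` mask is a hypothetical addition; whether such rows should be masked, dropped, or filled with a sentinel depends on the use case.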

For generic integers (including negative values), we could extend it like so:

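s = idx.min()
c = bincount2D_vectorized(idx-s)
# After shifting by s, value 0 sits in column -s; zero it out only if 0 lies within [min, max]
if 0 <= -s < c.shape[1]:
    c[:,-s] = 0
most_common = c.argmax(1)
num_most_common = c[np.arange(len(most_common)),most_common]
most_common += s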
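A minimal sketch of the shifted variant, wrapped in a function so it can be tried on mixed-sign data. The wrapper name `most_common_nonzero` and the sample arrays are assumptions for illustration. The bounds check is a defensive addition: if 0 lies outside the range of values in `idx`, the index `-s` would otherwise wrap around to another column or fall out of range.

```python
import numpy as np

def bincount2D_vectorized(a):
    # Row-wise bincount for a 2D array of non-negative ints.
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)

def most_common_nonzero(idx):
    # Hypothetical wrapper: per-row most common value other than 0, and its count.
    s = idx.min()
    c = bincount2D_vectorized(idx - s)   # shift so the smallest value maps to bin 0
    # Value 0 sits in column -s, but only if 0 lies inside [min, max].
    if 0 <= -s < c.shape[1]:
        c[:, -s] = 0
    most_common = c.argmax(1)
    num_most_common = c[np.arange(len(most_common)), most_common]
    return most_common + s, num_most_common

idx = np.array([
    [-2, 0, 0, 0, -2, 3],
    [0, -1, -1, 3, 3, 3],
])
values, counts = most_common_nonzero(idx)   # values: [-2, 3], counts: [2, 3]
```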

CDJB · Answer 2 · 2019-11-27T12:47:11.203

You can do the following, using a generator expression to count only the non-zero values:

most_common = np.array([Counter(x for x in r if x).most_common(1)[0][0] for r in idx])
num_most_common = np.array([Counter(x for x in r if x).most_common(1)[0][1] for r in idx])

or even, building each Counter only once:

count = np.array([Counter(x for x in r if x).most_common(1)[0] for r in idx])
most_common = count[:,0]
num_most_common = count[:,1]
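One edge case worth noting: if a row contains only zeros, the filtered Counter is empty, `most_common(1)` returns an empty list, and indexing it with `[0]` raises an IndexError. A hedged sketch of one way to guard against that is below; the helper name `top_nonzero`, its `default` parameter, and the sample array are assumptions for illustration.

```python
from collections import Counter
import numpy as np

idx = np.array([
    [0, 0, 0, 2, 2, 1],
    [3, 1, 3, 3, 0, 0],
    [0, 0, 0, 0, 0, 0],   # only zeros: the filtered Counter is empty
])

def top_nonzero(row, default=(0, 0)):
    # Hypothetical helper: (value, count) of the most common non-zero entry,
    # or `default` when the row has no non-zero entries.
    top = Counter(x for x in row.tolist() if x).most_common(1)
    return top[0] if top else default

count = np.array([top_nonzero(r) for r in idx])
most_common = count[:, 0]       # [2, 3, 0]
num_most_common = count[:, 1]   # [2, 3, 0]
```

Using `(0, 0)` as the fallback is just one choice; any sentinel that cannot be confused with a real result would work as well.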

edited Nov 27 '19 at 12:47

answered Nov 27 '19 at 12:25

No need for the list-comp, a generator is enough: `Counter(x for x in idx[i] if x)` – tobias_k Nov 27 '19 at 12:35
