counting the occurrences in a list of lists python

Question

I have a list of lists in python:

x=[['1', '1', '2', '2', '2', '2', '1', '0', '0'],  
['1', '1', '1', '0', '0', '1', '1', '0', '0'], 
['0', '0', '1', '2', '1', '0', '2', '1', '1']]

I want to know how to count the occurrences of each value in every sublist.

My output should be like (without using numpy and Counter):

{'1': 3, '2': 4, '0': 2}
{'1': 5, '0': 4}
{'0': 3, '1': 4, '2': 2}

Now, I have the solution which works for only one list, but doesn't work for list of lists.

newlist=[]
for el in x:
    n=el[0]
    newlist.append(n)
print(newlist)
 
list2=dict((i, newlist.count(i)) for i in newlist)
print(list2)

I didn't find an answer in another thread. Is anyone able to help? :)

`dict((i, newlist.count(i)) for i in newlist)` is an inefficient algorithm for counting. See the two top-rated answers in the linked duplicate. – juanpa.arrivillaga, Apr 22 '21 at 08:41

Koen02 · Answer 1 · 2021-04-22T08:16:23.513

2

count_dict = []  # one dict of counts per sublist
for el in x:
    count = {}
    for i in el:
        # single pass: add 1 to the running count for this element
        count[i] = count.get(i, 0) + 1
    count_dict.append(count)

count_dict will look like this:

[
  {'1': 3, '2': 4, '0': 2},
  {'1': 5, '0': 4},
  {'0': 3, '1': 4, '2': 2}
]

edited Apr 22 '21 at 08:16

answered Apr 22 '21 at 07:49

Koen02


This is an unnecessarily inefficient algorithm. Don't use it. It is O(N**2) and you can trivially do it in O(N) – juanpa.arrivillaga Apr 22 '21 at 07:57
I closed the question; this is a well-known duplicate. See the duplicate target for an efficient algorithm: basically, `count = {}`, then just `for x in sublist: count[x] = count.get(x, 0) + 1`. That is what `collections.Counter` implements, only in optimized C-level code – juanpa.arrivillaga Apr 22 '21 at 08:07
Yeah, this only works for 1D lists. He's asking about 2D lists, hence the extra for loop. – Koen02 Apr 22 '21 at 08:10
Obviously it is meant to count the results *in each sublist*. It is meant to replace `{i:el.count(i) for i in el}` not the whole thing. This is a well-known anti-pattern – juanpa.arrivillaga Apr 22 '21 at 08:11
Yeah but you're also using a for loop. So isn't this O(N**2) as well? – Koen02 Apr 22 '21 at 08:12
No. `{i:el.count(i) for i in el}` is quadratic time on the size of the sublist. The algorithm I used above is linear time on the size of the sublist. Note, `el.count(i)` is linear time. Again, **this is a well-known antipattern with a well-known alternative**. – juanpa.arrivillaga Apr 22 '21 at 08:13
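The difference discussed in these comments can be sketched as follows. The function names `count_quadratic` and `count_linear` are hypothetical, chosen only for illustration. Both return the same per-sublist dictionaries, but the first rescans the sublist once per element while the second makes a single pass.

```python
x = [
    ['1', '1', '2', '2', '2', '2', '1', '0', '0'],
    ['1', '1', '1', '0', '0', '1', '1', '0', '0'],
    ['0', '0', '1', '2', '1', '0', '2', '1', '1'],
]

def count_quadratic(sublist):
    # list.count scans the whole sublist, and it runs once per element
    return {item: sublist.count(item) for item in sublist}

def count_linear(sublist):
    # one pass; dict.get supplies 0 the first time an item is seen
    counts = {}
    for item in sublist:
        counts[item] = counts.get(item, 0) + 1
    return counts

quadratic_result = [count_quadratic(sub) for sub in x]
linear_result = [count_linear(sub) for sub in x]

print(linear_result == quadratic_result)
for counts in linear_result:
    print(counts)
```

On sublists of nine elements the difference is negligible; it should only become noticeable as the sublists grow.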

score 0 · Answer 2 · answered Apr 22 '21 at 07:48

0

You are only taking the first element of each sublist into account; loop once more:

x = [
    ['1', '1', '2', '2', '2', '2', '1', '0', '0'],
    ['1', '1', '1', '0', '0', '1', '1', '0', '0'],
    ['0', '0', '1', '2', '1', '0', '2', '1', '1']
]

for lst in x:
    newlist = []
    for sublst in lst:
        newlist.append(sublst)

    list2 = dict((i, newlist.count(i)) for i in newlist)
    print(list2)

Out:

{'1': 3, '2': 4, '0': 2}
{'1': 5, '0': 4}
{'0': 3, '1': 4, '2': 2}

answered Apr 22 '21 at 07:48

Maurice Meyer


Unnecessarily inefficient algorithm. – juanpa.arrivillaga Apr 22 '21 at 07:58
@juanpa.arrivillaga: All that 'inefficiency' blaming won't make it better and OP didn't ask about that, in the first place. The first dup has no accepted answer, both questions got a lot of answers, I don't get which of those answers you are referring to. It would be more understandable if you at least comment on the question ... – Maurice Meyer Apr 22 '21 at 08:34
What? What are you talking about, "inefficiency blaming"? Look, I'm pointing out that this is a classic anti-pattern, i.e. `result = {x:mylist.count(x) for x in mylist}` is quadratic time. You can use a simple `result = {}` then `for x in mylist: result[x] = result.get(x,0) + 1` for a linear time solution. A well-known solution. In the linked duplicate, the top two rated answers have this more efficient algorithm (`collections.Counter` does this for `Counter(mylist)`, just that it is implemented in optimized C-level code). – juanpa.arrivillaga Apr 22 '21 at 08:40
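As a cross-check of the claim that `Counter(mylist)` yields the same counts as the `result.get` loop, here is a minimal sketch. The asker ruled out `Counter`, so it appears here only for verification; `manual_count` is a hypothetical helper name.

```python
from collections import Counter

def manual_count(mylist):
    # linear-time counting with a plain dict, as described in the comment
    result = {}
    for x in mylist:
        result[x] = result.get(x, 0) + 1
    return result

mylist = ['1', '1', '2', '2', '2', '2', '1', '0', '0']

manual = manual_count(mylist)
reference = dict(Counter(mylist))  # used only to verify the manual loop

print(manual == reference)
```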

xio · Answer 3 · 2021-04-22T08:16:51.900

0

You can use this method:

x=[['1', '1', '2', '2', '2', '2', '1', '0', '0'],  
  ['1', '1', '1', '0', '0', '1', '1', '0', '0'], 
  ['0', '0', '1', '2', '1', '0', '2', '1', '1']]

for i in x:
    data={}
    for j in i:
        if j not in data:
            data[j]=i.count(j)
    print(data)

output:

{'1': 3, '2': 4, '0': 2}
{'1': 5, '0': 4}
{'0': 3, '1': 4, '2': 2}

or:

for i in x:
    data={}
    for j in i:
        data[j]=i.count(j)
    print(data)

output:

{'1': 3, '2': 4, '0': 2}
{'1': 5, '0': 4}
{'0': 3, '1': 4, '2': 2}

edited Apr 22 '21 at 08:16

answered Apr 22 '21 at 07:56

xio


This algorithm is unnecessarily inefficient. – juanpa.arrivillaga Apr 22 '21 at 07:58
I think there are other ways to improve the algorithm and this is a simple example – xio Apr 22 '21 at 08:17
I'm merely pointing this out because this algorithm is a well-known anti-pattern, with a well-known alternative. See the linked duplicate above. – juanpa.arrivillaga Apr 22 '21 at 08:20
If you look at the image above, `if j not in data` Prevents unintentional use of the `count` method – xio Apr 22 '21 at 08:31
Naturally, calculating duplicate values in the list is heavier than checking the availability of a value – xio Apr 22 '21 at 08:33
Ah, nevermind, I misread it. – juanpa.arrivillaga Apr 22 '21 at 08:43
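The effect of the `if j not in data` guard discussed above can be made visible with a small sketch. `CountingList` is a hypothetical `list` subclass written for this illustration; it only records how often `.count()` is called. The functions `with_guard` and `without_guard` are likewise illustrative names wrapping the two loops from the answer.

```python
class CountingList(list):
    """Hypothetical list subclass that records calls to .count()."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = 0

    def count(self, value):
        self.calls += 1
        return super().count(value)

def with_guard(row):
    data = {}
    for j in row:
        if j not in data:  # skip values that were already counted
            data[j] = row.count(j)
    return data

def without_guard(row):
    data = {}
    for j in row:
        data[j] = row.count(j)  # recounts every repeated value
    return data

values = ['1', '1', '2', '2', '2', '2', '1', '0', '0']

guarded_row = CountingList(values)
guarded = with_guard(guarded_row)

unguarded_row = CountingList(values)
unguarded = without_guard(unguarded_row)

print(guarded == unguarded)
print(guarded_row.calls, unguarded_row.calls)
```

With the guard, `.count()` runs once per distinct value instead of once per element. Each call still scans the whole row, so a single-pass `dict.get` loop would likely do less work when a sublist holds many distinct values.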
