Count number of occurrences of each unique item in list of lists

Question

I have a list of lists like the following:

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

I want to find the number of sublists in which each unique value in listoflist occurs. For example, "A" shows up in two sublists, and "D" also shows up in two sublists, even though it occurs twice in listoflist[2].

How can I get a dataframe with each unique element in one column and its frequency (the number of sublists that element shows up in) in another?

would this help you? https://stackoverflow.com/a/11829457/2572645 — Andy K, May 20 '18 at 19:40
You said: "How can I get a dataframe...". Are you working with Pandas, and searching for a Pandas specific solution? If so, please mention it in the question and add a pandas tag to your question. — akaihola, May 21 '18 at 17:33

llllllllll · Answer 1 · 2018-05-20T19:23:08.233

You can use itertools.chain together with collections.Counter:

In [94]: import itertools as it

In [95]: from collections import Counter

In [96]: Counter(it.chain(*map(set, listoflist)))
Out[96]: Counter({'A': 2, 'B': 2, 'C': 2, 'D': 2, 'X': 1, 'Y': 1, 'Z': 2})

As mentioned in the comment by @Jean-François Fabre, you can also use:

In [97]: Counter(it.chain.from_iterable(map(set, listoflist)))
Out[97]: Counter({'A': 2, 'B': 2, 'C': 2, 'D': 2, 'X': 1, 'Y': 1, 'Z': 2})
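
Outside an interactive session, the same idea can be wrapped in a small function. This is only a minimal sketch; the function name count_sublists is an assumption made for illustration and does not come from any library.

```python
from collections import Counter
from itertools import chain


def count_sublists(list_of_lists):
    """Count, for each item, how many sublists contain it at least once.

    The name count_sublists is hypothetical (chosen for this example).
    """
    # set(sublist) removes repeats within a sublist, so each sublist
    # contributes at most 1 to an item's count.
    return Counter(chain.from_iterable(map(set, list_of_lists)))


listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]
counts = count_sublists(listoflist)
print(counts["D"])  # 2: "D" is in the first and third sublists
```

chain.from_iterable reads the sets one at a time, whereas chain(*...) has to unpack all of them into arguments first, which is why the second form is usually preferred for long inputs.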

edited May 20 '18 at 19:23

answered May 20 '18 at 19:17

`it.chain(*map(set, listoflist))` => `it.chain.from_iterable(map(set, listoflist))` – Jean-François Fabre May 20 '18 at 19:20
@Jean-FrançoisFabre Thanks. – llllllllll May 20 '18 at 19:23

Andrey Tyukin · Answer 2 · 2018-05-20T19:27:34.810

Essentially, it seems that you want something like

Counter(x for xs in listoflist for x in set(xs))

Each sublist is first converted into a set, so duplicates within a sublist are counted only once. The sets are then flattened into a single stream of items and fed into the Counter.

Full code:

from collections import Counter

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

c = Counter(x for xs in listoflist for x in set(xs))

print(c)

Results in:

# output:
# Counter({'B': 2, 'C': 2, 'Z': 2, 'D': 2, 'A': 2, 'Y': 1, 'X': 1})
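
Since the question asks for a dataframe, the Counter can be turned into one. A possible sketch, assuming pandas is available; the column names item and frequency are placeholders chosen here, not something the question specifies.

```python
from collections import Counter

import pandas as pd

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

c = Counter(x for xs in listoflist for x in set(xs))

# Column names "item" and "frequency" are assumptions for this example.
# sorted() fixes the row order, since set iteration order is arbitrary.
df = pd.DataFrame(sorted(c.items()), columns=["item", "frequency"])
print(df)
```

Each row of df then holds one unique element and the number of sublists it appears in.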

score 1 · Answer 3 · answered May 21 '18 at 17:03

Another way to do this is to use pandas:

import pandas as pd

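# One row per sublist; shorter sublists are padded with missing values
df = pd.DataFrame(listoflist)
# Reshape to one row per item, then count the distinct row numbers
# (level_0) in which each item appears
df.stack().reset_index().groupby(0)['level_0'].nunique().to_dict()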

Output:

{'A': 2, 'B': 2, 'C': 2, 'D': 2, 'X': 1, 'Y': 1, 'Z': 2}
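
If the result should stay a DataFrame rather than a dict, one variant is to use Series.explode instead of stack. This is a sketch of a different technique from the one above, and the names sublist, item, and frequency are assumptions made for illustration.

```python
import pandas as pd

listoflist = [["A", "B", "A", "C", "D"], ["Z", "A", "B", "C"], ["D", "D", "X", "Y", "Z"]]

# One row per (sublist number, item) pair; the names are assumptions.
pairs = (
    pd.Series(listoflist, name="item")
    .explode()
    .rename_axis("sublist")
    .reset_index()
)

# Count distinct sublist numbers per item, then turn the result back
# into a two-column DataFrame.
result = (
    pairs.groupby("item")["sublist"]
    .nunique()
    .reset_index(name="frequency")
)
print(result)
```

Because nunique counts distinct sublist numbers, an item repeated inside one sublist still contributes only once.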

Scott Boston
