0

I have the following list:

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

I want to count the number of items with a particular file extension and store it in a dictionary.

Expected output is:

extn_dict = {'jpg': 3, 'mov': 2, 'pdf': 4}

I'm writing the following code:

for item in files_list:
    extn_dict[item[-3:]] = count(item) # I understand I should not have count() here but I'm not sure how to count them.

How can I count the number of items in the list with a particular extension?

Hannan
  • Possible duplicate of [How to count the occurrences of a list item?](https://stackoverflow.com/questions/2600191/how-to-count-the-occurrences-of-a-list-item) – Peter Wood Feb 15 '18 at 18:55

7 Answers

12
>>> from collections import Counter
>>> files_list
['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
>>> c = Counter(x.split(".")[-1] for x in files_list)
>>> c
Counter({'pdf': 4, 'jpg': 3, 'mov': 2})
Gahan
jrjames83
  • True. For any future readers: Counter accepts any iterable, be it a list or a generator expression. Even a string works: `Counter('jeff')` returns `Counter({'f': 2, 'j': 1, 'e': 1})`. – jrjames83 Feb 15 '18 at 19:02
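
To illustrate the comment above, here is a small sketch showing that Counter produces the same tallies from a list and from a generator expression. The helper name `extension` is made up for this example and is not part of the original answer.

```python
from collections import Counter

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov',
              'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

def extension(name):
    # Hypothetical helper: the text after the last dot.
    return name.split('.')[-1]

# A list and a generator expression are both just iterables to Counter.
from_list = Counter([extension(f) for f in files_list])
from_generator = Counter(extension(f) for f in files_list)

# Counter is a dict subclass; convert it if a plain dict is required.
extn_dict = dict(from_generator)
```

The generator form avoids building an intermediate list, which only starts to matter for much longer inputs.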
2

The easiest way is probably:

>>> d = {}
>>> for item in files_list:
...     d[item[-3:]] = d.get(item[-3:], 0) + 1
... 
>>> d
{'pdf': 4, 'mov': 2, 'jpg': 3}
l'L'l
  • I've fiddled with this way too long trying `setdefault` and `update`, all the while totally oblivious of `get` providing a default return value... thanks for the lesson. – r.ook Feb 15 '18 at 19:51
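
For readers comparing the options mentioned in the comment above, this is a rough side-by-side of `get`, `setdefault`, and (as an addition not mentioned in the thread) `collections.defaultdict`. The variable names are illustrative only.

```python
from collections import defaultdict

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov',
              'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

# dict.get returns the default (0) for a missing key without inserting it.
with_get = {}
for item in files_list:
    ext = item[-3:]
    with_get[ext] = with_get.get(ext, 0) + 1

# dict.setdefault inserts the default first; the value is then incremented.
with_setdefault = {}
for item in files_list:
    ext = item[-3:]
    with_setdefault.setdefault(ext, 0)
    with_setdefault[ext] += 1

# defaultdict(int) creates a missing key with value 0 on first access.
with_defaultdict = defaultdict(int)
for item in files_list:
    with_defaultdict[item[-3:]] += 1
```

All three build the same counts; which one reads best is largely a matter of taste.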
1

The easiest way is to loop over the list and use a dictionary to store your counts.

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 
              'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
counts = {}
for f in files_list:
    ext = f[-3:]
    if ext not in counts:
        counts[ext] = 0
    counts[ext] += 1

print(counts)
# {'pdf': 4, 'mov': 2, 'jpg': 3} (key order may differ by Python version)

No doubt, there are other fancy solutions, but I think this is easier to understand.

If you can't assume that the extension will always be 3 characters long, you can change the ext = line to:

ext = f.split(".")[-1]

Other posters have shown this approach in their answers.
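
A sketch of how the two approaches differ on awkward names, with `os.path.splitext` added as a third option that is not used in this thread. The file names below are hypothetical, chosen only to exercise the edge cases.

```python
import os.path
from collections import Counter

# Hypothetical names: a 4-letter extension, a double extension, and no extension.
names = ['pic1.jpeg', 'notes.txt', 'archive.tar.gz', 'README', 'page.html']

def ext_by_slice(name):
    # Assumes every extension is exactly 3 characters long.
    return name[-3:]

def ext_by_split(name):
    # Text after the last dot; returns the whole name if there is no dot.
    return name.split('.')[-1]

def ext_by_splitext(name):
    # os.path.splitext returns (root, ext); ext keeps its leading dot
    # and is an empty string when the name has no extension.
    return os.path.splitext(name)[1][1:]

by_slice = Counter(ext_by_slice(n) for n in names)
by_split = Counter(ext_by_split(n) for n in names)
by_splitext = Counter(ext_by_splitext(n) for n in names)
```

Slicing reports 'peg' for 'pic1.jpeg', and `split` counts 'README' as its own extension, while `splitext` files it under the empty string. Which behaviour is wanted depends on the data.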

pault
1
files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
extension_set = [i.split('.')[-1] for i in files_list]
d = {j:extension_set.count(j) for j in extension_set}
print(d)

Analysis:

Current method - 10000 loops, best of 3: 25.3 µs per loop

Counter - 10000 loops, best of 3: 30.5 µs per loop (best of 3: 33.3 µs per loop with import statement)

itertools - 10000 loops, best of 3: 41.1 µs per loop (best of 3: 44 µs per loop with import statement)
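
The timings above will vary with machine and Python version. A minimal sketch for reproducing the comparison with the standard `timeit` module follows; the function names and the loop count of 1000 are arbitrary choices for this example, not the setup used for the figures above.

```python
import itertools
import timeit
from collections import Counter

files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov',
              'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']

def ext(name):
    return name.split('.')[-1]

def with_list_count():
    # list.count rescans the whole list once per element.
    extensions = [ext(f) for f in files_list]
    return {e: extensions.count(e) for e in extensions}

def with_counter():
    # Counter tallies in a single pass over the input.
    return dict(Counter(ext(f) for f in files_list))

def with_groupby():
    # groupby needs the input sorted by the same key.
    ordered = sorted(files_list, key=ext)
    return {k: len(list(g)) for k, g in itertools.groupby(ordered, key=ext)}

timings = {}
for func in (with_list_count, with_counter, with_groupby):
    # Total seconds for 1000 calls of each function.
    timings[func.__name__] = timeit.timeit(func, number=1000)
    print(func.__name__, timings[func.__name__])
```

On a nine-item list the differences are tiny. Because `list.count` rescans the list for every element, that approach is likely to fall behind on much longer lists.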

Gahan
0

You can use itertools.groupby:

import itertools
files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
ext_of = lambda name: name.split('.')[-1]  # key: text after the last dot
final_counts = {ext: len(list(group)) for ext, group in itertools.groupby(sorted(files_list, key=ext_of), key=ext_of)}

Output:

{'pdf': 4, 'mov': 2, 'jpg': 3}
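
The sorted() call in this approach matters because groupby only merges consecutive items that share a key. A small sketch with a deliberately interleaved, hypothetical list shows the difference:

```python
import itertools

# Hypothetical list with the two jpg files separated by a pdf.
files_list = ['pic1.jpg', 'doc1.pdf', 'pic2.jpg']

def ext(name):
    return name.split('.')[-1]

# Without sorting, the two jpg files end up in separate groups.
unsorted_groups = [(k, len(list(g)))
                   for k, g in itertools.groupby(files_list, key=ext)]

# Sorting by the same key first puts equal extensions next to each other.
sorted_groups = [(k, len(list(g)))
                 for k, g in itertools.groupby(sorted(files_list, key=ext), key=ext)]

print(unsorted_groups)  # [('jpg', 1), ('pdf', 1), ('jpg', 1)]
print(sorted_groups)    # [('jpg', 2), ('pdf', 1)]
```

Building a dict from the unsorted groups would silently keep only the last count per extension, so the sort should not be dropped.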
Ajax1234
0

You can use the Counter class from the collections module:

from collections import Counter
files_list = ['pic1.jpg', 'pic2.jpg', 'pic3.jpg', 'movie1.mov', 'movie2.mov', 'doc1.pdf', 'doc2.pdf', 'doc3.pdf', 'doc4.pdf']
temp = []
for item in files_list:
    temp.append(item[-3:])

print(Counter(temp))
# Counter({'pdf': 4, 'jpg': 3, 'mov': 2})
om tripathi
0

Using Counter and map instead of a list comprehension:

from collections import Counter
Counter(map(lambda x: x.split('.')[-1], files_list))
Espoir Murhabazi