Counter for multiple tuple conditions

Question

I have a tuple:

for i in my_tup:
   print(i)

Output:

   (Transfer, 0:33:20, Cycle 1)                 
   (Transfer, 0:33:10, Cycle 1)                    
   (Download, 0:09:10, Cycle 1)          
   (Transfer, 0:33:10, Cycle 1)                   
   (Download, 0:13:00, Cycle 1)            
   (Download, 0:12:30, Cycle 2)           
   (Transfer, 0:33:10, Cycle 2)              
   (Download, 0:02:00, Cycle 2)            
   (Transfer, 0:33:00, Cycle 2)              
   (Transfer, 0:33:00, Cycle 2)               
   (Transfer, 0:33:00, Cycle 2)            
   (Transfer, 0:32:40, Cycle 2)

I am trying to count the number of occurrences of 'Transfer' per Cycle category, i.e. how many Transfer occurrences are in Cycle 1, how many in Cycle 2, and so on.

I can work it out for the first cycle but not for the ones after it (there are many more cycles in the real output).

accumulatedList = []
count = 0
for i in range(0, len(my_tup)):
     if my_tup[i][0] == 'Transfer' and my_tup[i][2] == 'Cycle 1':
          count +=1
     accumulatedList.append(count)

I am not sure how to do it for the other cycles too.

jlandercy · Answer 1 · 2018-03-26T14:01:44.737

Using the pandas library, it is straightforward:

import pandas as pd
df = pd.DataFrame(my_tup, columns=['Category', 'TimeSpan', 'Cycle'])
g = df.groupby(['Category', 'Cycle']).size()

It returns:

Category  Cycle  
Download  Cycle 1    2
          Cycle 2    2
Transfer  Cycle 1    3
          Cycle 2    5
dtype: int64

If you only care about Transfer, select it by its index label:

g['Transfer']

Cycle
Cycle 1    3
Cycle 2    5
dtype: int64

score 2 · Answer 2 · answered Mar 26 '18 at 15:33

You can use collections.Counter for an O(n) solution.

from collections import Counter

c = Counter()

for cat, time, cycle in lst:
    if cat == 'Transfer':
        c[cycle] += 1

Result

Counter({'Cycle 1': 3,
         'Cycle 2': 5})

Setup

lst =  [('Transfer', '0:33:20', 'Cycle 1'),                 
        ('Transfer', '0:33:10', 'Cycle 1'),        
        ('Download', '0:09:10', 'Cycle 1'),        
        ('Transfer', '0:33:10', 'Cycle 1'),                 
        ('Download', '0:13:00', 'Cycle 1'),          
        ('Download', '0:12:30', 'Cycle 2'),         
        ('Transfer', '0:33:10', 'Cycle 2'),            
        ('Download', '0:02:00', 'Cycle 2'),          
        ('Transfer', '0:33:00', 'Cycle 2'),            
        ('Transfer', '0:33:00', 'Cycle 2'),             
        ('Transfer', '0:33:00', 'Cycle 2'),          
        ('Transfer', '0:32:40', 'Cycle 2')]

Explanation

Use a collections.Counter object to increment a cycle key each time the category is "Transfer".
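The same idea extends to counting every category at once by keying the counter on a (category, cycle) pair. A minimal sketch, assuming the records are 3-tuples shaped like `lst` in the setup above; the names `records` and `pair_counts` are illustrative only, and the sample is a shortened, made-up subset:

```python
from collections import Counter

# Made-up sample shaped like the question's data: (category, time, cycle)
records = [
    ('Transfer', '0:33:20', 'Cycle 1'),
    ('Download', '0:09:10', 'Cycle 1'),
    ('Transfer', '0:33:10', 'Cycle 1'),
    ('Download', '0:12:30', 'Cycle 2'),
    ('Transfer', '0:33:10', 'Cycle 2'),
]

# Key the counter on (category, cycle) so every combination is tallied in one pass
pair_counts = Counter((cat, cycle) for cat, _, cycle in records)

# Look up a single combination; a missing key returns 0 instead of raising
print(pair_counts[('Transfer', 'Cycle 1')])   # 2
print(pair_counts[('Upload', 'Cycle 1')])     # 0
```

This is still a single pass over the data, and it avoids writing one condition per cycle.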

@wwii, agreed. But for this specific purpose I would argue `collections.Counter` is a better option. — jpp, Mar 26 '18 at 15:43

score 1 · Answer 3 · answered Mar 26 '18 at 13:48

You could do it with pandas:

import pandas as pd

df = pd.DataFrame([("Transfer", "0:33:20", "Cycle 1"),
("Transfer", "0:33:10", "Cycle 1"),
("Download", "0:09:10", "Cycle 1"),
("Transfer", "0:33:10", "Cycle 1"),
("Download", "0:13:00", "Cycle 1"),
("Download", "0:12:30", "Cycle 2"),
("Transfer", "0:33:10", "Cycle 2"),
("Download", "0:02:00", "Cycle 2"),
("Transfer", "0:33:00", "Cycle 2"),
("Transfer", "0:33:00", "Cycle 2"),
("Transfer", "0:33:00", "Cycle 2"),
("Transfer", "0:32:40", "Cycle 2")])

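# Keep only the "Transfer" rows (column 0), then count rows per cycle (column 2).
# Grouping on column 2 alone would count Download rows as well.
counts = df[df[0] == "Transfer"].groupby(2).size()

counts["Cycle 1"]
counts["Cycle 2"]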

score 0 · Answer 4 · answered Mar 26 '18 at 13:41

You can use a dictionary to hold the results:

result = {}

for var, _, cycle in my_tup:
    if var == 'Transfer':
        try:
            result[cycle] += 1
        except KeyError:
            result[cycle] = 1

Then result would look something like:

{'Cycle 1': 3, 'Cycle 2': 5}

ForceBru

Using try catch for checking if element is in dictionary is not a great idea. – MK. Mar 26 '18 at 13:54
@MK., why is that? – ForceBru Mar 26 '18 at 13:58
Because it is less readable and because it looks terrible to people who write in languages other than Python. – MK. Mar 26 '18 at 14:15
(you were hoping I'd say it is slower, weren't you? ;) – MK. Mar 26 '18 at 14:17
@MK., in Python it's preferable to use the approach called [Easier to Ask for Forgiveness than Permission](https://docs.python.org/2/glossary.html#term-eafp). It may seem "terrible" and less readable to people who write in other languages that use the [Look Before You Leap](https://docs.python.org/2/glossary.html#term-lbyl) style, which is understandable, but that's just a matter of personal preference. However, EAFP is considered to be the [Pythonic way](https://stackoverflow.com/questions/12265451/ask-forgiveness-not-permission-explain) of doing this. – ForceBru Mar 26 '18 at 14:35
I'm all for Pythonic way where it doesn't make python contradict other languages best practices. Python is, in my opinion, the best second language to have, and as such I want it to be awesome at being the second language. – MK. Mar 26 '18 at 14:42
The idea is good, but `collections.Counter` is a ready-made tool for this (see my solution). – jpp Mar 26 '18 at 15:34
@ForceBru, when you use `try` this way it is the same design as `if`, even if the internal behavior is different. You are branching on a case, not catching an edge case. IMO, your design is more LBYL than EAFP. – jlandercy Mar 27 '18 at 06:43
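
For reference, the styles debated in these comments can be put side by side. A minimal sketch, assuming a small made-up list `rows`; the function names are hypothetical and not taken from any answer:

```python
from collections import defaultdict

# Made-up sample rows: (category, time, cycle)
rows = [
    ('Transfer', '0:33:20', 'Cycle 1'),
    ('Download', '0:09:10', 'Cycle 1'),
    ('Transfer', '0:33:10', 'Cycle 2'),
    ('Transfer', '0:33:00', 'Cycle 2'),
]

def count_eafp(rows):
    """try/except: attempt the increment, handle the missing key afterwards."""
    result = {}
    for cat, _, cycle in rows:
        if cat == 'Transfer':
            try:
                result[cycle] += 1
            except KeyError:
                result[cycle] = 1
    return result

def count_lbyl(rows):
    """Explicit membership test before touching the key."""
    result = {}
    for cat, _, cycle in rows:
        if cat == 'Transfer':
            if cycle in result:
                result[cycle] += 1
            else:
                result[cycle] = 1
    return result

def count_defaultdict(rows):
    """defaultdict(int) supplies 0 for missing keys, so no branch is needed."""
    result = defaultdict(int)
    for cat, _, cycle in rows:
        if cat == 'Transfer':
            result[cycle] += 1
    return dict(result)

print(count_eafp(rows))         # {'Cycle 1': 1, 'Cycle 2': 2}
print(count_lbyl(rows))         # same result
print(count_defaultdict(rows))  # same result
```

All three produce the same dictionary; they differ only in how the first occurrence of a key is handled.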

score 0 · Answer 5 · answered Mar 26 '18 at 15:40

Sort and group by the first and last items of the tuples; iterate over the groups and add Transfer groups to a dictionary.

import operator, itertools, collections

a = [('Transfer', '0:33:20', 'Cycle 1'),('Transfer', '0:33:10', 'Cycle 1'),
     ('Download', '0:09:10', 'Cycle 1'),('Transfer', '0:33:10', 'Cycle 1'),
     ('Download', '0:13:00', 'Cycle 1'),('Download', '0:12:30', 'Cycle 2'),
     ('Transfer', '0:33:10', 'Cycle 2'),('Download', '0:02:00', 'Cycle 2'),
     ('Transfer', '0:33:00', 'Cycle 2'),('Transfer', '0:33:00', 'Cycle 2'),
     ('Transfer', '0:33:00', 'Cycle 2'),('Transfer', '0:32:40', 'Cycle 2')]

key = operator.itemgetter(0,2)

a.sort(key=key)
d = {}
for (direction, cycle), group in itertools.groupby(a, key):
    g = list(group)
    if direction == 'Transfer':
        d[cycle] = len(g)
    #print(direction, cycle, g)

>>> d
{'Cycle 1': 3, 'Cycle 2': 5}
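
`itertools.groupby` only merges consecutive items with equal keys, which is why the list is sorted first. A minimal sketch of the same approach that leaves the input untouched by using `sorted` instead of an in-place sort; the helper name `transfers_per_cycle` and the shortened sample are hypothetical:

```python
import itertools
import operator

def transfers_per_cycle(records):
    """Count 'Transfer' records per cycle via sort + groupby."""
    key = operator.itemgetter(0, 2)  # (category, cycle)
    counts = {}
    # sorted() returns a new list, so the caller's data keeps its order
    for (category, cycle), group in itertools.groupby(sorted(records, key=key), key):
        if category == 'Transfer':
            # group is a one-shot iterator, so count it by consuming it
            counts[cycle] = sum(1 for _ in group)
    return counts

# Made-up, deliberately unsorted sample: (category, time, cycle)
sample = [
    ('Transfer', '0:33:20', 'Cycle 1'),
    ('Download', '0:12:30', 'Cycle 2'),
    ('Transfer', '0:33:10', 'Cycle 2'),
    ('Download', '0:09:10', 'Cycle 1'),
    ('Transfer', '0:33:10', 'Cycle 1'),
]

print(transfers_per_cycle(sample))   # {'Cycle 1': 2, 'Cycle 2': 1}
```

The sort makes this O(n log n), so it is mainly worthwhile when the grouped items themselves are needed rather than just their counts.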
