0

I have the following DataFrame:

df2 = pd.DataFrame({'Worker': ['A','A','B','B','C','D','D'],
                    'Task': ['T1','T2','T1','T3','T4','T1','T5'],
                    'Hour': [1,2,1,2,3,1,4]
                   })

The output is

  Worker Task  Hour
0      A   T1     1
1      A   T2     2
2      B   T1     1
3      B   T3     2
4      C   T4     3
5      D   T1     1
6      D   T5     4

I want to have table as below:

       Staff   Hour
Task     
 T1        3      3
 T2        1      2
 T3        1      2  
 T4        1      3
 T5        1      4

How could it be done? Thanks.

[Edit: Solution from @ayhan but with updated requirement]

df2.groupby('Task').agg({'Worker': 'size', 'Hour': 'sum'})

However, I have just realized that I do not count (i.e., size) the occurrence of 'Worker' but determine number of unique workers per task.

I have modified the DataFrame as follow:

df2 = pd.DataFrame({'Worker': ['A','A','B','B','C','D','D'],
                    'Task': ['T1','T1','T2','T3','T4','T2','T2'],
                    'Hour': [1,2,1,2,3,1,4]
                   })

The output is

  Worker Task  Hour
0      A   T1     1
1      A   T1     2
2      B   T2     1
3      B   T3     2
4      C   T4     3
5      D   T2     1
6      D   T2     4

And the desired output is

       Staff   Hour
Task     
 T1        1      3
 T2        2      6
 T3        1      2  
 T4        1      3
yoonghm
  • 4,198
  • 1
  • 32
  • 48
  • 2
    `df2.groupby('Task').agg({'Worker': 'size', 'Hour': 'sum'})` – ayhan Dec 02 '18 at 10:37
  • 1
    You can change `size` to `nunique` if you want the number of unique workers (`df2.groupby('Task').agg({'Worker': 'nunique', 'Hour': 'sum'})`). – ayhan Dec 02 '18 at 11:40

0 Answers0