pandas pivot-table how to add nested columns

Question

df

   billsec disposition        Date Hour
0      185    ANSWERED  2016-11-01   00
1        0   NO ANSWER  2016-11-01   00
2       41    ANSWERED  2016-11-01   01
3        4    ANSWERED  2016-12-02   05

I have this table and need to build a summary table from it with the following data:

The rows are the hours of the day and the columns are the days; under each day there are three sub-columns: the total number of calls, the number of missed calls, and the total call duration.

How can I add the additional sub-columns (All, Lost, Time) to this table? So far I have only managed to calculate the total call duration per hour and the total number of calls, and only in separate tables:

df.pivot_table(index='Hour', columns='Date', aggfunc=len, fill_value=0)
df.pivot_table(index='Hour', columns='Date', aggfunc=sum, fill_value=0)

Welcome to StackOverflow. Please post your data sets as text so people could copy and paste them and use them for coding an answer - it's not possible when you use images. Please read [how to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — MaxU - stand with Ukraine, Feb 13 '17 at 22:15
Could you also explain how would you like to calculate `All`, `Lost`, `Time` columns? — MaxU - stand with Ukraine, Feb 15 '17 at 19:55
I was thinking: `All` is the number of rows for that day and hour; `Lost` is the number of rows whose `disposition` is "NO ANSWER"; `Time` is the sum of `billsec` for that day and hour. — luboff, Feb 15 '17 at 21:43
@MaxU It was great, exactly what I needed. Thank you so much — luboff, Feb 16 '17 at 14:57
glad i could help :-). Please consider [accepting](http://meta.stackexchange.com/a/5235/348814) an answer if you think it has answered your question. — MaxU - stand with Ukraine, Feb 16 '17 at 14:58

MaxU - stand with Ukraine · Accepted Answer · 2017-02-15T22:16:47.553

IIUC you can do it this way:

assuming we have the following DataFrame:

In [248]: df
Out[248]:
             calldate  billsec disposition
0 2016-11-01 00:05:26      185    ANSWERED
1 2016-11-01 00:01:26        0   NO ANSWER
2 2016-11-01 00:05:19       41    ANSWERED
3 2016-11-01 00:16:02        4    ANSWERED
4 2016-11-02 01:16:02       55    ANSWERED
5 2016-11-02 02:02:02        2   NO ANSWER

we can do the following:

funcs = {
    'billsec': {
        'all':'size',
        'time':'sum'
    },
    'disposition': {
        'lost': lambda x: (x == 'NO ANSWER').sum()
    }
}

(df.assign(d=df.calldate.dt.strftime('%d.%m'), t=df.calldate.dt.hour)
   .groupby(['t','d'])[['billsec','disposition']].agg(funcs)
   .unstack('d', fill_value=0)
   .swaplevel(axis=1)
   .sort_index(level=[0,1], axis=1)
)

yields:

d 01.11           02.11
    all time lost   all time lost
t
0     4  230    1     0    0    0
1     0    0    0     1   55    0
2     0    0    0     1    2    1
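
Note for recent pandas versions: the nested-dict form of `agg` used above was deprecated in pandas 0.20 and removed in 1.0, where it raises `SpecificationError`. A rough equivalent using named aggregation might look like the sketch below. The helper column `lost` and the variable name `summary` are illustrative additions, not part of the original answer, and the sub-columns come out in alphabetical order (`all`, `lost`, `time`) rather than the order shown above.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    "calldate": pd.to_datetime([
        "2016-11-01 00:05:26", "2016-11-01 00:01:26", "2016-11-01 00:05:19",
        "2016-11-01 00:16:02", "2016-11-02 01:16:02", "2016-11-02 02:02:02",
    ]),
    "billsec": [185, 0, 41, 4, 55, 2],
    "disposition": ["ANSWERED", "NO ANSWER", "ANSWERED",
                    "ANSWERED", "ANSWERED", "NO ANSWER"],
})

summary = (
    df.assign(
        d=df["calldate"].dt.strftime("%d.%m"),    # day label, e.g. "01.11"
        t=df["calldate"].dt.hour,                 # hour of the day
        lost=df["disposition"].eq("NO ANSWER"),   # hypothetical helper flag
    )
    .groupby(["t", "d"])
    .agg(
        all=("billsec", "size"),   # number of calls
        time=("billsec", "sum"),   # total duration
        lost=("lost", "sum"),      # number of unanswered calls
    )
    .unstack("d", fill_value=0)    # days become a column level
    .swaplevel(axis=1)             # put the day above the statistic
    .sort_index(axis=1)            # group the columns by day
)

print(summary)
```

Precomputing the boolean `lost` column avoids the lambda, so every aggregation can be passed as a plain string.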
