-1

I was trying to make a sub totals and grand totals for a data. But some where i stuck and couldn't make my deserved output. Could you please assist on this.

data.groupby(['Column4', 'Column5'])['Column1'].count()

Current Output:

            Column4        Column5       
2018-05-19  Duplicate         220
            Informative         3
2018-05-20  Actionable          5
            Duplicate         270
            Informative       859
            Non-actionable      2
2018-05-21  Actionable          8
            Duplicate         295
            Informative        17
2018-05-22  Actionable         10
            Duplicate         424
            Informative        36
2018-05-23  Actionable          8
            Duplicate         157
            Informative         3
2018-05-24  Actionable          5
            Duplicate          78
            Informative         3
2018-05-25  Actionable          3
            Duplicate          80

Expected Output:

            Row Labels Actionable   Duplicate   Informative Non-actionable  Grand Total
5/19/2018       219                 3           222
5/20/2018   5   270    859          2           1136
5/21/2018   8   295    17           320
5/22/2018   10  424    36           470
5/23/2018   8   157    3            168
5/24/2018   5   78     3            86
5/25/2018   3   80     83
Grand Total 39  1523   921          2           2485

This is a sample data. Could you please have a look with before my ask. I am getting minuted errors. May be i wasn't gave right data. Please kindly check for once. Column1 Column2 Column3 Column4 Column5 Column6 BI Account Subject1 2:12 PM 5/19/2018 Duplicate Name1 PI Account Subject2 1:58 PM 5/19/2018 Actionable Name2 AI Account Subject3 5:01 PM 5/19/2018 Non-Actionable Name3 BI Account Subject4 5:57 PM 5/19/2018 Informative Name4 PI Account Subject5 6:59 PM 5/19/2018 Duplicate Name5 AI Account Subject6 8:07 PM 5/19/2018 Actionable Name1

Dinesh Gedda
  • 39
  • 1
  • 9
  • 2
    Welcome to SO. Please provide a **[mcve]**. As such, your specific problem appears unclear as we do not have a sample of your raw data, nor do we know what logic you use to reach your desired output. – jpp May 28 '18 at 23:26

3 Answers3

1

You can use pivot to get from your current output to your desired output and then sum to calculate the totals you want.

import pandas as pd

df = df.reset_index().pivot('index', values='Column5', columns='Column4')

# Add grand total columns, summing across all other columns
df['Grand Total'] = df.sum(axis=1)
df.columns.name = None
df.index.name = None

# Add the grand total row, summing all values in a column
df.loc['Grand Total', :] = df.sum()

df is now:

             Actionable  Duplicate  Informative  Non-actionable  Grand Total
2018-05-19          NaN      220.0          3.0             NaN        223.0
2018-05-20          5.0      270.0        859.0             2.0       1136.0
2018-05-21          8.0      295.0         17.0             NaN        320.0
2018-05-22         10.0      424.0         36.0             NaN        470.0
2018-05-23          8.0      157.0          3.0             NaN        168.0
2018-05-24          5.0       78.0          3.0             NaN         86.0
2018-05-25          3.0       80.0          NaN             NaN         83.0
Grand Total        39.0     1524.0        921.0             2.0       2486.0
ALollz
  • 57,915
  • 7
  • 66
  • 89
  • Thanks for quick come back and i am getting an error called cannot insert Column1, already exists – Dinesh Gedda May 28 '18 at 23:55
  • @dineshgedda It seems that the current output you provided isn't complete then. The problem is with `.reset_index()`; you cant reset the index if the index name is the same as a column that already exists. Is the index of your current output named `Column1` and does your current output also have another column that is not shown named `Column1`? – ALollz May 29 '18 at 00:04
  • Actual data is Column1: Source Column2: Subject Column3: Timestamp Column4: Date Column5:Grouped Names( i addressed in previous post) Column6: Name (Person). I think it was mis manipulated over publishing data under Column4:Date will come and Column5: Grouped names called(Actionable/Duplicate--). I thought of that getting error for above one. But Thanks a lot and can you just re-check with my current input. @petersmith – Dinesh Gedda May 29 '18 at 00:13
  • yes you are right Column1 already existed – Dinesh Gedda May 29 '18 at 00:37
  • 1
    i just made few changes often troubleshoot with index issue as you mentioned. I got the desired output. Thanks a lot for your assistance. – Dinesh Gedda May 29 '18 at 11:20
1

Just using crosstab

pd.crosstab(df['Column4'], df['Column5'], margins = True, margins_name = 'Grand Total' )
BENY
  • 317,841
  • 20
  • 164
  • 234
0

Take a look at this : https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html

You need to pivot your table :

df.reset_index().pivot(index='date', columns='Column4', values='Column5')
Mohamed AL ANI
  • 2,012
  • 1
  • 12
  • 29