0


This is input table in pandas: enter image description here

this is an output table as shown below: enter image description here dtype: int64

Dear Friends,
I am new to pandas, how to get the result is shown in the second image using pandas. I am getting output as shown below using this approach

"df.groupby(['Months', 'Status']).size()"


Months Status
Apr-20 IW 2
OW 1
Jun-20 IW 4
OW 4
May-20 IW 3
OW 2

dtype: int64
But how to convert this output as shown in the second image? It will be more helpful if someone is able to help me. Thanks in advance.

2 Answers2

2

Use crosstab with margins=True parameter, then if necessary remove last Total column, change order of columns by DataFrame.reindex with ordering of original column and last convert index to column by DataFrame.reset_index and remove columns names by DataFrame.rename_axis:

df = (pd.crosstab(df['Status'], df['Months'],  margins_name='Total', margins=True)
       .iloc[:, :-1]
       .reindex(df['Months'].unique(), axis=1)
       .reset_index()
       .rename_axis(None, axis=1))
print (df)
  Status  Apr_20  May_20  Jun_20
0     IW       4       2       4
1     OW       1       2       4
2  Total       5       4       8
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • How to remove the below row is having 'Total', from your solutions, I got the Total at Top too, whereas I need to remove the below Total and also I would like to add a column to the left side as shown in the second the image – Balrajswamy Jan 03 '20 at 10:45
  • @Balrajswamy - Add `df = df.reset_index().rename_axis(None, axis=1)` – jezrael Jan 03 '20 at 11:05
  • Yes, I could remove the Total at Bottom, Thanks a lot for your great help. – Balrajswamy Jan 03 '20 at 11:27
  • In addition to the above solutions, i want to add a column at zero index. when I tried to add a column using df.insert(0,'Column_name', 'XXX'), but it went to second column. i did not know this. pls suggest me that what is wrong in the syntax? – Balrajswamy Jan 03 '20 at 11:30
  • @Balrajswamy - First column in my solution `0,1,2` is called index, so if use `df.insert(0,'Column_name', 'XXX')` it add column before `Status`. If want set `index` to same values use `df.index = np.repeat('XXX', len(df))` – jezrael Jan 03 '20 at 11:32
  • Dear Friend, I would like to add a list to the df at '0 ' index : – Balrajswamy Jan 03 '20 at 11:45
  • example:
    df['comp_name']=['XXX','KA','IN']
    df.index = np.repeat('XXX', df['comp_name'])
    – Balrajswamy Jan 03 '20 at 11:45
  • @Balrajswamy - then use `df.index = ['XXX','KA','IN']` – jezrael Jan 03 '20 at 11:46
1

Unstack, and then transpose:

df = df.groupby(['Months', 'Status']).size().unstack().T

To get a total row:

df.sum().rename('Total').to_frame().T.append(df)
ignoring_gravity
  • 6,677
  • 4
  • 32
  • 65