Appending column totals to a Pandas DataFrame

Question

I have a DataFrame with numerical values. What is the simplest way of appending a row (with a given index value) that represents the sum of each column?

score 106 · Answer 1 · edited Dec 08 '16 at 19:53

106

To add a Total column which is the sum across the row:

df['Total'] = df.sum(axis=1)

edited Dec 08 '16 at 19:53

glifchits

683
7
26

answered Apr 09 '15 at 14:23

ideate

1,193
1
7
2

1

Does not give a meaningful result, if there are non-numeric columns in the df. – Holger Brandl Apr 17 '18 at 13:38
10

This answer adds a column, not a row as the asker requested. (However, this answer helps me with the problem I was trying to solve, so I appreciate it.) – Mr. Lance E Sloan Jul 01 '18 at 01:18
3

This answer is solving a different problem. – cs95 May 19 '19 at 06:04

score 102 · Answer 2 · answered Jan 15 '16 at 13:35

102

To add a row with column-totals:

df.loc['Total']= df.sum()

answered Jan 15 '16 at 13:35

BjoernL.

1,211
1
9
8

5

Does not give a meaningful result, if there are non-numeric columns in the df. – Holger Brandl Apr 17 '18 at 13:38
2

Incase of nan, first apply df.fillna(0) then use this sum – David Gladson Sep 23 '19 at 10:07

BhishanPoudel · Answer 3 · 2021-02-10T16:09:31.937

50

Get Both Column Total and Row Total

This gives total on both rows and columns:

import numpy as np
import pandas as pd


df = pd.DataFrame({'a': [10,20],'b':[100,200],'c': ['a','b']})

df.loc['Column_Total']= df.sum(numeric_only=True, axis=0)
df.loc[:,'Row_Total'] = df.sum(numeric_only=True, axis=1)

print(df)

                 a      b    c  Row_Total
0             10.0  100.0    a      110.0
1             20.0  200.0    b      220.0
Column_Total  30.0  300.0  NaN      330.0

edited Feb 10 '21 at 16:09

answered Jun 23 '19 at 02:08

BhishanPoudel

15,974
21
108
169

solved it for me. Is there a YouTube video I can watch that explains how this works. Just reading the documentation is not the best when you're a visual learner like me. – JQTs Apr 04 '22 at 16:00
This worked for me, but only after deleting `.loc`. Not sure why. – Sam R Aug 10 '22 at 17:22

score 8 · Answer 4 · answered Dec 27 '13 at 17:39

One way is to create a DataFrame with the column sums, and use DataFrame.append(...). For example:

import numpy as np
import pandas as pd
# Create some sample data
df = pd.DataFrame({"A": np.random.randn(5), "B": np.random.randn(5)}) 
# Sum the columns:
sum_row = {col: df[col].sum() for col in df}
# Turn the sums into a DataFrame with one row with an index of 'Total':
sum_df = pd.DataFrame(sum_row, index=["Total"])
# Now append the row:
df = df.append(sum_df)

score 5 · Answer 5 · edited Mar 27 '14 at 19:37

I have done it this way:

df = pd.concat([df,pd.DataFrame(df.sum(axis=0),columns=['Grand Total']).T])

this will add a column of totals for each row:

df = pd.concat([df,pd.DataFrame(df.sum(axis=1),columns=['Total'])],axis=1)

It seems a little annoying to have to turn the Series object (or in the answer above, dict) back into a DataFrame and then append it, but it does work for my purpose.

It seems like this should just be a method of the DataFrame - like pivot_table has margins.

Perhaps someone knows of an easier way.

score 5 · Answer 6 · edited May 08 '16 at 01:36

5

You can use the append method to add a series with the same index as the dataframe to the dataframe. For example:

df.append(pd.Series(df.sum(),name='Total'))

edited May 08 '16 at 01:36

Alex Huszagh

13,272
3
39
67

answered May 08 '16 at 00:26

yihui.dev

602
8
10

@Alexander Huszagh, what about both row and col total? – alancalvitti Feb 12 '19 at 20:20
If you want to add to the end of your method chain do this: `.pipe(lambda df: df.append(pd.Series(df.sum(), name='Total')))` – Levi Baguley Apr 26 '21 at 17:43

score 1 · Answer 7 · edited Dec 13 '19 at 10:18

1

Calculate sum and convert result into list(axis=1:row wise sum, axis=0:column wise sum)
Add result of step-1, to the existing dataFrame with new name

new_sum_col = list(df.sum(axis=1))
df['new_col_name'] = new_sum_col

edited Dec 13 '19 at 10:18

FBruzzesi

6,385
3
15
37

answered Dec 13 '19 at 09:45

Biplab Malakar

750
6
11

Adrien Pacifico · Answer 8 · 2020-07-08T12:12:57.277

0

I did not find the modern pandas approach! This solution is a bit dirty due to two chained transposition, I do not know how to use .assign on rows.

# Generate DataFrame
import pandas as pd
df = pd.DataFrame({'a': [10,20],'b':[100,200],'c': ['a','b']})

# Solution
df.T.assign(Total = lambda x: x.sum(axis=1)).T

output:

    a    b  c  Total
0  10  100  a    110
1  20  200  b    220

edited Jul 08 '20 at 12:12

answered Jun 10 '20 at 11:27

Adrien Pacifico

1,649
1
15
33

score 0 · Answer 9 · answered Feb 01 '22 at 13:54

For those that have trouble because the result is 0 or NaN, check dtype first.

df.dtypes

Since sum can only process numeric try to change the type of your dataframe first. In this example, chang to int32 for integer.

df = df.astype('int32')
df.dtypes

Then, you should be able to sum across row and add new column (as the accepted answer, not the question).

df['sum']= df.sum(numeric_only=True,axis=1)

Bonus: Sort the sum column

df.sort_values(by=['sum'])

Appending column totals to a Pandas DataFrame

9 Answers9

Get Both Column Total and Row Total

Linked

Related

Appending column totals to a Pandas DataFrame

9 Answers9

** Get Both Column Total and Row Total **

Linked

Related

Get Both Column Total and Row Total