91

I have a dataframe with values like

A B
1 4
2 6
3 9

I need to add a new column by adding values from column A and B, like

A B C
1 4 5
2 6 8
3 9 12

I believe this can be done using a lambda function, but I can't figure out how to do it.

n00b

11 Answers

135

Very simple:

df['C'] = df['A'] + df['B']
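
For completeness, here is a minimal, self-contained sketch of that one-liner using the sample data from the question. The DataFrame construction is an assumption about how the data was loaded; only the addition itself comes from the answer.

```python
import pandas as pd

# Sample data from the question (construction is assumed)
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 6, 9]})

# Element-wise (vectorized) addition of the two columns
df["C"] = df["A"] + df["B"]

print(df)
```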
DeepSpace
  • 50
    I get the following warning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead – n00b Dec 01 '15 at 15:45
  • Running __version__ gives me '0.16.2' – n00b Dec 01 '15 at 15:51
  • 1
    I get the same warning with version: 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)] – spec3 Nov 26 '19 at 20:03
  • @spec3 https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – DeepSpace Nov 26 '19 at 20:49
  • I'm having trouble creating a fourth column. I already have columns `a`, `b` and `c`, but when I try to compute the sum `b + c = d` I get `AttributeError: 'DataFrame' object has no attribute c`. What could be the issue? – Elias Prado Jan 29 '21 at 02:43
  • @n00b: Try explicitly copying the DataFrame before running assignments on it: `df = df.copy(); df['C'] = df['A'] + df['B']` – Martijn Courteaux Nov 07 '21 at 09:06
78

Building a little more on Anton's answer, you can add all the columns like this:

df['sum'] = df[list(df.columns)].sum(axis=1)
sparrow
  • 2
    I can't believe there are not many upvotes for this answer. This is the only one where you don't need to type column names individually to get the sum! Thanks @sparrow! – Geek Dec 24 '17 at 17:10
  • 10
    you could drop `list(df.columns)` as it's redundant here. So final code should look like `df['sum'] = df.sum(axis=1)` – Anton Protopopov Jun 14 '18 at 09:29
56

The simplest way would be to use DeepSpace's answer. However, if you really want to use an anonymous function, you can use apply:

df['C'] = df.apply(lambda row: row['A'] + row['B'], axis=1)
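
As a rough sketch of how the two approaches relate: apply with axis=1 calls the lambda once per row, while the plain addition works on whole columns at once, so both give the same values but the vectorized form is usually the faster one on large frames. The variable names below are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 6, 9]})

# Row-wise: the lambda receives each row as a Series
via_apply = df.apply(lambda row: row["A"] + row["B"], axis=1)

# Column-wise: one vectorized operation over both columns
via_add = df["A"] + df["B"]

# Both approaches produce the same values
print((via_apply == via_add).all())
```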
efajardo
37

You could use the sum function to achieve that, as @EdChum mentioned in the comments:

df['C'] =  df[['A', 'B']].sum(axis=1)

In [245]: df
Out[245]: 
   A  B   C
0  1  4   5
1  2  6   8
2  3  9  12
Anton Protopopov
17

You could do:

df['C'] = df.sum(axis=1)

If you only want to sum the numeric columns:

df['C'] = df.sum(axis=1, numeric_only=True)

The axis parameter takes either 0 or 1: axis=0 sums down the rows, giving one total per column, while axis=1 sums across the columns, giving one total per row (which is what a new column needs).
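
To make the axis argument concrete, here is a small sketch on the question's data. The extra "label" column is a made-up addition, included only to show what numeric_only=True skips.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 6, 9]})

col_totals = df.sum(axis=0)  # one value per column: A -> 6, B -> 19
row_totals = df.sum(axis=1)  # one value per row: 5, 8, 12

# Hypothetical non-numeric column, ignored when numeric_only=True
df_mixed = df.assign(label=["x", "y", "z"])
numeric_row_totals = df_mixed.sum(axis=1, numeric_only=True)  # 5, 8, 12

print(col_totals, row_totals, numeric_row_totals, sep="\n")
```

Note that df.sum(axis=1) adds up every column present, so running it again after 'C' has been created would include 'C' in the total.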

Manuel Martinez
16

As of Pandas version 0.16.0 you can use assign as follows:

df = pd.DataFrame({"A": [1,2,3], "B": [4,6,9]})
df.assign(C = df.A + df.B)

# Out[383]: 
#    A  B   C
# 0  1  4   5
# 1  2  6   8
# 2  3  9  12

You can add multiple columns this way as follows:

df.assign(C = df.A + df.B,
          Diff = df.B - df.A,
          Mult = df.A * df.B)
# Out[379]: 
#    A  B   C  Diff  Mult
# 0  1  4   5     3     4
# 1  2  6   8     4    12
# 2  3  9  12     6    27
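
One detail worth illustrating: assign returns a new DataFrame and leaves the original untouched, so the result has to be captured if you want to keep it. The sketch below also passes a lambda, which assign accepts and calls with the DataFrame; the name `result` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 6, 9]})

# assign returns a new DataFrame; the callable receives the frame itself
result = df.assign(C=lambda d: d["A"] + d["B"])

print("C" in df.columns)      # the original frame is unchanged
print(result["C"].tolist())
```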
steveb
4

Concerning n00b's comment: "I get the following warning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead"

I was getting the same warning. In my case it was because I was trying to perform the column addition on a dataframe that was created like this:

df_b = df[['colA', 'colB', 'colC']]

instead of:

df_c = pd.DataFrame(df, columns=['colA', 'colB', 'colC'])

df_b is flagged by pandas as a copy of a slice from df, whereas df_c is a new, independent dataframe. So

df_c['colD'] = df['colA'] + df['colB']+ df['colC']

will add the columns and won't raise any warning. Same if .sum(axis=1) is used.
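
Another common way to get an independent dataframe, also suggested in the comments on the top answer, is an explicit .copy() on the selection. A minimal sketch, with made-up values and an invented "other" column standing in for the rest of the original dataframe:

```python
import pandas as pd

# Hypothetical data; only the column names colA..colC come from the answer
df = pd.DataFrame({"colA": [1, 2], "colB": [3, 4], "colC": [5, 6], "other": [0, 0]})

# Explicit copy: df_b is now independent of df
df_b = df[["colA", "colB", "colC"]].copy()

# Adding a column to the copy does not touch df
df_b["colD"] = df_b["colA"] + df_b["colB"] + df_b["colC"]

print(df_b)
```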

firefly
4

You can do this using loc:

In [37]:  df = pd.DataFrame({"A":[1,2,3],"B":[4,6,9]})

In [38]: df
Out[38]:
   A  B
0  1  4
1  2  6
2  3  9

In [39]: df['C']=df.loc[:,['A','B']].sum(axis=1)

In [40]: df
Out[40]:
   A  B   C
0  1  4   5
1  2  6   8
2  3  9  12
Roushan
3

I wanted to add a comment responding to the error message n00b was getting but I don't have enough reputation. So my comment is an answer in case it helps anyone...

n00b said:

I get the following warning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

He got this warning because whatever manipulations he did to his dataframe prior to creating df['C'] produced a view into the dataframe rather than a copy of it. The warning didn't arise from the simple calculation df['C'] = df['A'] + df['B'] suggested by DeepSpace.

Have a look at the Returning a view versus a copy docs.

tgraybam
0

eval lets you sum and create columns right away:

In [8]: df.eval('C = A + B', inplace=True)

In [9]: df
Out[9]: 
   A  B   C
0  1  4   5
1  2  6   8
2  3  9  12

Since inplace=True you don't need to assign it back to df.
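
For comparison, a short sketch of the same expression without inplace=True: in that case eval returns a new DataFrame and the original is left as it was. The name `new_df` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 6, 9]})

# Without inplace=True, eval returns a new DataFrame with the extra column
new_df = df.eval("C = A + B")

print(list(df.columns))      # original columns only
print(new_df["C"].tolist())
```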

rachwa
-2

You can solve it simply by adding the two columns: df['C'] = df['A'] + df['B']

  • Why is this answer voted down? Does it not work? Or is the performance too bad? Please give some more hints. – tangoal May 13 '23 at 08:43