I have a pandas DataFrame called original and I would like to add a new column to it and save the resultant DataFrame in a variable called modified. How do I do that?

import pandas as pd
import numpy as np
original = pd.DataFrame(np.random.randn(5, 2), columns=['a', 'b'])

The solution given in the very similarly named questions here is to do something like:

original['c'] = original['b'].abs()

This does not work for me because it modifies the original DataFrame in place. A potential solution is to use join, but that does not let me name the new column, nor does it let me fill the column with a scalar value:

modified = original.join(original['b'].abs(),rsuffix='_abs')

The aim is to be able to add the column in a single line, without temporary variables, to achieve the following effect:

modified = (
    original.some_op()
    .a_different_op()
    .add_a_column()  # <- the step I can't figure out
    .another_op()
    .final_op()
)
Roger

1 Answer

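Use the pandas.DataFrame.assign method. It returns a new DataFrame with the added column and leaves the original unchanged, so it can be used inside a method chain. It is described here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.assign.html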
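
A minimal sketch of how assign might fit the chain from the question. The column names b_abs and flag are made up for illustration; the keyword passed to assign becomes the column name, and the value can be a callable that receives the DataFrame or a scalar that is broadcast to every row:

```python
import numpy as np
import pandas as pd

original = pd.DataFrame(np.random.randn(5, 2), columns=['a', 'b'])

# assign returns a new DataFrame, so `original` is not modified.
modified = (
    original
    .assign(b_abs=lambda df: df['b'].abs())  # derived column (hypothetical name b_abs)
    .assign(flag=1)                          # scalar column (hypothetical name flag)
)

print(original.columns.tolist())
print(modified.columns.tolist())
```

Passing a callable rather than original['b'].abs() means the expression is evaluated against whatever DataFrame reaches that step of the chain, which matters when earlier steps have already changed the data.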

wirrbel