Adding a column to a Pandas DataFrame as a copy

Question

I have a pandas DataFrame called original and I would like to add a new column to it and save the resultant DataFrame in a variable called modified. How do I do that?

import pandas as pd
import numpy as np
original = pd.DataFrame(np.random.randn(5, 2), columns=['a', 'b'])

The solution given in the very similarly named questions here is to do something like:

original['c'] = original['b'].abs()

This does not work for me because it modifies the original DataFrame. A potential solution is to use join, but that does not let me name the new column, nor does it allow the column to be filled with a scalar value:

modified = original.join(original['b'].abs(),rsuffix='_abs')

The aim is to be able to add the column in a single line, without temp variables, to achieve the following effect:

modified = (
    original.some_op()
    .a_different_op()
    .add_a_column()  # <- the step I can't figure out
    .another_op()
    .final_op()
)

Copy first then add? `modified = original.copy(); modified['c'] = ...` — Viktor Kerkez, Sep 10 '13 at 15:11
Why not just use a temporary variable and rename it and/or fill it? — Phillip Cloud, Sep 10 '13 at 15:12
The why is simple. The above style avoids creating new intermediate identifiers that would be immediately discarded and makes complex data transformations easier to follow. — Roger, Sep 10 '13 at 15:26
I mean used once, and never touched again. @PhillipCloud, thanks btw for contributing to pandas. — Roger, Sep 11 '13 at 17:25

score 5 · Answer 1 · answered Feb 25 '16 at 13:56

Use the pandas.DataFrame.assign method. It returns a new DataFrame with the added column and leaves the original unchanged. It is described here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.assign.html
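
For illustration, here is a minimal sketch of how assign could slot into a chain like the one in the question. The rename and sort_values steps are only stand-ins I picked for the unspecified operations, and the column names c and d are hypothetical:

```python
import numpy as np
import pandas as pd

original = pd.DataFrame(np.random.randn(5, 2), columns=['a', 'b'])

# assign returns a new DataFrame; original keeps only columns a and b
modified = original.assign(c=original['b'].abs())

# Inside a chain, pass a callable so the new column is computed from the
# intermediate DataFrame; a scalar value is broadcast to every row.
chained = (
    original
    .rename(columns={'a': 'x'})                # stand-in for an earlier step
    .assign(c=lambda df: df['b'].abs(), d=0)   # named column plus scalar fill
    .sort_values('c')                          # stand-in for a later step
)
```

The keyword name becomes the column name, which covers both things join could not do in the question: naming the column and filling it with a scalar.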

wirrbel