
Situation: I have a DataFrame that is passed as input to several functions, each of which should return a copy of the input DataFrame with the data modified according to that function.

Question: How do I set up the functions so that they do not modify the original (input) DataFrame when they run?

Example

import pandas as pd

df_input = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
df_input

    a   b
0   1   4
1   2   5
2   3   6

def new_func(df):
    df_out = df
    df_out['new'] = 'C'
    return df_out

df_output = new_func(df_input)
df_output

    a   b   new
0   1   4   C
1   2   5   C
2   3   6   C

df_input

    a   b   new
0   1   4   C
1   2   5   C
2   3   6   C

The desired result is for only df_output to have the added column; df_input should stay unchanged.

It's probably very straightforward, but any pointers or suggestions would be much appreciated!

Zaphod

1 Answer


You need to make an explicit copy of df. In Python, an assignment such as a = b does not copy anything; it only binds a second name to the same object:

def new_func(df):
    df_out = df.copy()
    df_out['new'] = 'C'
    return df_out
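
As a quick check, here is a minimal sketch of the difference. The name `alias` and the printed checks are added for illustration only and are not part of the original question or answer:

```python
import pandas as pd

df_input = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Plain assignment only binds another name to the same object.
alias = df_input
print(alias is df_input)  # True

def new_func(df):
    df_out = df.copy()    # independent copy of the input
    df_out['new'] = 'C'   # only the copy gets the new column
    return df_out

df_output = new_func(df_input)
print(list(df_input.columns))   # ['a', 'b']
print(list(df_output.columns))  # ['a', 'b', 'new']
```

Since `copy()` defaults to a deep copy, later changes to the values in `df_output` should not show up in `df_input` either.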

Note that if all you want is to add a new column, you can simply use assign, which returns a new DataFrame and leaves the original untouched:

df_output = df_input.assign(new='C')
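
Wrapped in the function from the question, this could look like the sketch below. It is only an illustration of the same idea, with the printed checks added to show that the input is left alone:

```python
import pandas as pd

df_input = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

def new_func(df):
    # assign builds and returns a new DataFrame; df itself is not modified
    return df.assign(new='C')

df_output = new_func(df_input)
print('new' in df_input.columns)   # False
print('new' in df_output.columns)  # True
```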
mozway
  • Got it! Thanks! The reason why `df.copy()` would not be needed if, for instance, we did `df_out = df.iloc[:,1:]` is that `df.iloc` itself returns a copy (of a slice), correct? – Zaphod Mar 24 '23 at 08:00
  • I should add, the reason I ask is this answer [here](https://stackoverflow.com/questions/27673231/why-should-i-make-a-copy-of-a-data-frame-in-pandas) which suggests making an explicit copy even when setting up a slice. – Zaphod Mar 24 '23 at 08:05
  • Indeed, `(i)loc` makes a copy here, but it gets trickier because slicing sometimes returns a view. For example, `df2 = df ; df2 is df` -> `True`, but `df2 = df[['col']] ; df2 is df` -> `False`. Nevertheless, modifying `df2` can still affect `df` when `df2` is a view of `df` (`df2._is_view` -> `True`). – mozway Mar 24 '23 at 08:05
  • `assign` also always returns a copy. – mozway Mar 24 '23 at 08:09
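
Because it is not always obvious whether a slice is a view or a copy, one defensive option is to call `copy()` on the slice as well, as the linked answer suggests. A minimal sketch follows; the helper name `drop_first_column` is hypothetical and used only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

def drop_first_column(df):
    # Hypothetical helper: slice, then copy explicitly so the result
    # never shares data with the input, whatever the slice returned.
    df_out = df.iloc[:, 1:].copy()
    df_out['new'] = 'C'
    return df_out

out = drop_first_column(df)
out.loc[0, 'b'] = 99          # change a value in the result only

print(list(df.columns))       # ['a', 'b']
print(df.loc[0, 'b'])         # 4
```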