
I have a DataFrame and I want to generate a new one by changing the values in just one column, while keeping the original DataFrame intact. I have tried mask, where, and iloc, but the original DataFrame always changes.

import pandas as pd

data = {
  "age": [50, 40, 30, 40, 20, 10, 30],
  "qualified": [True, False, False, False, False, True, True]
}
df = pd.DataFrame(data)

newdf = df
newdf["age"] = newdf.where(newdf["age"] > 30,2)

print(newdf)
print(df)

Result:

   age  qualified
0  50       True
1  40      False
2   2      False
3  40      False
4   2      False
5   2       True
6   2       True
   age  qualified
0  50       True
1  40      False
2   2      False
3  40      False
4   2      False
5   2       True
6   2       True

Is there some way to change these values and keep the original?

nz_J

1 Answer


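Use df.copy(deep=True). Writing newdf = df does not copy anything: it only binds a second name to the same DataFrame, so every change made through newdf also shows up in df. See also: What is the difference between a deep copy and a shallow copy?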

import pandas as pd
import numpy as np

data = {
  "age": [50, 40, 30, 40, 20, 10, 30],
  "qualified": [True, False, False, False, False, True, True]
}
df = pd.DataFrame(data)

# deep copy
newdf = df.copy(deep=True)


newdf["age"] = np.where(newdf["age"] > 30, newdf["age"], 2)
print(newdf)
   age  qualified
0   50       True
1   40      False
2    2      False
3   40      False
4    2      False
5    2       True
6    2       True

print(df)
   age  qualified
0   50       True
1   40      False
2   30      False
3   40      False
4   20      False
5   10       True
6   30       True
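
As a side note, here is a minimal sketch of the same idea using only pandas, assuming the goal is to replace every age that is not greater than 30 with 2. It first checks that plain assignment is just an alias, then uses Series.where on a copy. The names alias and assigned are only illustrative, and df.assign is shown as one possible alternative that returns a new DataFrame without an explicit copy.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [50, 40, 30, 40, 20, 10, 30],
    "qualified": [True, False, False, False, False, True, True],
})

# Plain assignment only creates a second name for the same object.
alias = df
print(alias is df)  # True

# copy() defaults to deep=True, so newdf gets its own data.
newdf = df.copy()
# Series.where keeps values where the condition holds and uses 2 elsewhere.
newdf["age"] = newdf["age"].where(newdf["age"] > 30, 2)

# Alternative: assign() returns a new DataFrame and leaves df untouched.
assigned = df.assign(age=df["age"].where(df["age"] > 30, 2))

print(newdf["age"].tolist())  # [50, 40, 2, 40, 2, 2, 2]
print(df["age"].tolist())     # [50, 40, 30, 40, 20, 10, 30]
```

Both newdf and assigned hold the modified column, while df keeps its original values.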
JAdel