0

I copy a dataframe and then add a column to the copy, but the column is also added to the original dataframe.

X_train_1 = X_train
X_train_1["class_label"] = y_train
print(X_train.columns)
Annu Roy

3 Answers

0

As stated here, you need to copy the dataframe. Check this minimal sample:

import pandas as pd

X_train = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 3, 'b': 4}, {'a': 4, 'b': 5}])
X_train_1 = X_train.copy()
print(X_train_1)
X_train_1["class_label"] = ['one', 'two', 'three', 'four']
print(X_train)
AlexGuevara
0

When you write

X_train_1 = X_train

it does not copy the dataframe. It binds a second name to the same object (assignment by reference, not by value), so any change you make through the new variable also modifies the original. You will see the same behaviour with lists, for example. As others have suggested, make a copy using

X_train_1 = X_train.copy()
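
To make the list comparison concrete, here is a minimal sketch in plain Python. The names original, alias and independent are only for illustration and do not come from the question.

```python
# Plain assignment binds a second name to the same list object.
original = [1, 2, 3]
alias = original
alias.append(4)

print(alias is original)  # True: both names refer to one object
print(original)           # [1, 2, 3, 4]: the append is visible through both names

# list.copy() creates a new, independent list.
independent = original.copy()
independent.append(5)

print(independent is original)  # False
print(original)                 # [1, 2, 3, 4]: unchanged by the second append
```

The same `is` check works on dataframes: `X_train_1 is X_train` is True after a plain assignment and False after `X_train.copy()`.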

SUN
0

When copying a dataframe, use the copy method rather than just assigning it to a new variable. The following code does not modify the original dataframe.

X_train_1 = X_train.copy()
X_train_1["class_label"] = y_train
print(X_train.columns)
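
If you want to check this yourself, a self-contained version could look like the sketch below. The data in X_train and y_train is made up for illustration, since the question does not show the real values.

```python
import pandas as pd

# Hypothetical stand-ins for the question's X_train and y_train.
X_train = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
y_train = pd.Series(["one", "two", "three"])

X_train_1 = X_train.copy()          # independent copy (deep by default)
X_train_1["class_label"] = y_train  # adds the column to the copy only

print(list(X_train.columns))    # ['a', 'b']
print(list(X_train_1.columns))  # ['a', 'b', 'class_label']
```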
Yashi Aggarwal