I copy a dataframe and then add a column to the copy, but this also adds the column to the original dataframe.
X_train_1 = X_train
X_train_1["class_label"] = y_train
print(X_train.columns)
As stated here, you need to copy the dataframe. Check this minimal sample:
import pandas as pd
X_train = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 3, 'b': 4}, {'a': 4, 'b': 5}])
X_train_1 = X_train.copy()
print(X_train_1)
X_train_1["class_label"] = ['one', 'two', 'three', 'four']
print(X_train)
When you write
X_train_1 = X_train
this does not copy the data. It assigns the variable by reference and not by value, so both names point to the same dataframe object. Whatever change you make through the new variable therefore modifies the original. You will observe the same behaviour if you try this with lists, for example. As suggested by others, make a copy using
X_train_1 = X_train.copy()
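The difference between the two assignments can be checked directly with the `is` operator. The following is a minimal sketch on a small made-up dataframe; the names `alias`, `independent`, and the sample data are only for illustration and are not from the question.

```python
import pandas as pd

# Small illustrative dataframe (made-up data)
X_train = pd.DataFrame({"a": [1, 2, 3], "b": [2, 3, 4]})

alias = X_train               # second name for the same object
independent = X_train.copy()  # new object with its own data

alias_is_same = alias is X_train        # True: both names refer to one object
copy_is_same = independent is X_train   # False: the copy is a separate object

# Adding a column to the copy leaves the original untouched
independent["class_label"] = ["one", "two", "three"]
columns_after_copy_edit = list(X_train.columns)

# Adding a column through the alias changes the original
alias["class_label"] = ["one", "two", "three"]
columns_after_alias_edit = list(X_train.columns)

# Lists behave the same way under plain assignment
original_list = [1, 2]
list_alias = original_list
list_alias.append(3)

print(alias_is_same, copy_is_same)
print(columns_after_copy_edit)
print(columns_after_alias_edit)
print(original_list)
```

If `alias is X_train` evaluates to `True`, any in-place change made through either name is visible through the other.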
When copying a dataframe, use the copy method rather than just assigning it to a new variable. The following code will not modify the original dataframe.
X_train_1 = X_train.copy()
X_train_1["class_label"] = y_train
print(X_train.columns)