I am using Pandas 0.24.2 to prepare some data for machine learning. To set up the data, I used StandardScaler() from scikit-learn to standardize the features. However, I am getting this odd warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
The odd thing is that I am already using the .iloc[] indexer on the dataframes. Here is the code. Note that x_train and x_test are dataframes.
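import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

dat = pd.read_csv('/path/data.csv')
x_train, x_test = train_test_split(dat,
                                   test_size=0.2,
                                   random_state=42)
scaler = StandardScaler()
# Scale every column except the last one (the label); these two lines trigger the warning
x_train.iloc[:, :-1] = scaler.fit_transform(x_train.iloc[:, :-1])
x_test.iloc[:, :-1] = scaler.transform(x_test.iloc[:, :-1])
x_mean = scaler.mean_
x_std = scaler.scale_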
Can anyone figure out what the actual problem is? I avoided rescaling the last column in the dataframe because it is the label column.
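For comparison, here is a minimal sketch of a variant that might avoid the warning. It assumes the cause is that the frames returned by train_test_split are still treated by pandas as slices of dat, so it takes explicit copies before assigning. The synthetic frame and the column names f1, f2, f3, and label are hypothetical stand-ins for the CSV, which is not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for pd.read_csv('/path/data.csv'):
# three float feature columns followed by one label column.
rng = np.random.RandomState(0)
dat = pd.DataFrame(rng.normal(size=(100, 3)), columns=["f1", "f2", "f3"])
dat["label"] = rng.randint(0, 2, size=100)

x_train, x_test = train_test_split(dat, test_size=0.2, random_state=42)

# Assumption: the warning comes from the split results being flagged as
# possible copies of dat. Explicit copies make them independent frames.
x_train = x_train.copy()
x_test = x_test.copy()

scaler = StandardScaler()
# Fit on the training features only, then reuse the same statistics for the test set.
x_train.iloc[:, :-1] = scaler.fit_transform(x_train.iloc[:, :-1])
x_test.iloc[:, :-1] = scaler.transform(x_test.iloc[:, :-1])

x_mean = scaler.mean_   # per-feature means learned from x_train
x_std = scaler.scale_   # per-feature standard deviations learned from x_train
```

The scaling logic is unchanged from the code above; the only difference is the two .copy() calls, so if the warning disappears with them, the split results rather than the .iloc[] assignment itself would be the likely source.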