workclass = X_train[~X_train['workclass'].isnull()]['workclass'].unique()
for dataset in [X_train, X_test]:
    df = dataset[dataset['workclass'].isnull()].index
    size = len(df)
    s = pd.Series([workclass[np.random.randint(0, 8)] for _ in range(size)], index=df, dtype=object)
    dataset.loc[:, 'workclass'] = dataset.loc[:, 'workclass'].fillna(s)
Output
S:\AnacondaPF\lib\site-packages\pandas\core\indexing.py:965: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
The last line raises a SettingWithCopyWarning even though I am using .loc.
Despite the warning, all the missing values in both datasets have been filled.
Can anyone explain why the warning appears?
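
One possible cause, assuming X_train and X_test were themselves created by slicing a larger DataFrame (the code that builds them is not shown here): pandas can flag a frame derived from another frame as a potential copy, and then warn on any later assignment to it, whether or not that assignment uses .loc. Under that assumption, a minimal sketch of the same imputation on explicit copies looks like the following. The toy frame `df`, its values, and the seeded `rng` are hypothetical stand-ins, not the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
df = pd.DataFrame({
    "workclass": ["Private", None, "State-gov", "Self-emp", None, "Private"],
    "age": [25, 38, 28, 44, 18, 34],
})

# Assumption: the train/test frames are slices of a parent frame.
# Calling .copy() makes each one an independent DataFrame.
X_train = df.iloc[:4].copy()
X_test = df.iloc[4:].copy()

# Seeded generator so the random fill is repeatable.
rng = np.random.default_rng(0)

# Candidate categories: the non-missing values seen in the training data.
workclass = list(X_train["workclass"].dropna().unique())

for dataset in [X_train, X_test]:
    # Index labels of the rows whose 'workclass' is missing.
    missing_idx = dataset.index[dataset["workclass"].isnull()]
    # One randomly chosen category per missing row, aligned on the index.
    fill = pd.Series(
        rng.choice(workclass, size=len(missing_idx)),
        index=missing_idx,
        dtype=object,
    )
    dataset["workclass"] = dataset["workclass"].fillna(fill)
```

Sampling with `len(missing_idx)` and the full `workclass` list also avoids hard-coding the number of categories. If the frames really are independent copies, the parent `df` is left untouched by the fill.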