import pandas as pd
from sklearn.model_selection import train_test_split

def prep_data(embeddings, df):
    """
    Prepares the training and validation sets.
    :return: x_train, y_train, x_valid, y_valid
    """
    # Original DataFrame
    OD_df = df.reset_index()
    OD_df.loc[:,'type'] = 'nails'
    embeddings_df = pd.DataFrame(data=embeddings)
    embeddings_df = embeddings_df.reset_index()
    # Join the embeddings to the original rows on the shared 'index' column
    embedded_df = pd.merge(embeddings_df, OD_df, on='index')
    train, valid = train_test_split(embedded_df, test_size=0.2, random_state=42, shuffle=True)
    #train_col = ['train'] * len(train)
    #train['split']= train_col
    train.loc[:,'split'] = 'train'
    #valid_col = ['valid'] * len(valid)
    #valid['split']= valid_col
    valid.loc[:,'split'] = 'valid'
    x_train, y_train = train.iloc[:, :3840], train.loc[:,'nails']
    x_valid, y_valid = valid.iloc[:, :3840], valid.loc[:,'nails']
    return x_train, y_train, x_valid, y_valid
Calling Code
x_train, y_train, x_valid, y_valid = prep_data(embeddings, df)
I get the following warning. I've tried using the .iloc method and .loc as well, and even plain subsetting without either of them, but it keeps giving me the same warning message. When I inspect the DataFrames, everything seems to be in order, but I'm not sure how to suppress or resolve this warning.
Warning
/home/maria/my_python_env/lib/python3.7/site-packages/pandas/core/indexing.py:1596: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
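For reference, here is a minimal sketch of one commonly suggested way to avoid this warning: take an explicit copy of each split before adding the column. It assumes the warning comes from pandas treating train and valid as slices of embedded_df, which I have not confirmed for this exact setup. The helper label_splits and the toy embedded_df are hypothetical, and plain positional slices stand in for the output of train_test_split.

```python
import numpy as np
import pandas as pd


def label_splits(train, valid):
    """Return copies of the two splits with a 'split' column added.

    Hypothetical helper, not part of the original prep_data.
    """
    # .copy() gives each split its own data, so the assignments below
    # are no longer made on a slice of the parent DataFrame.
    train = train.copy()
    valid = valid.copy()
    train["split"] = "train"
    valid["split"] = "valid"
    return train, valid


# Toy stand-in for embedded_df: 10 rows, 4 embedding columns plus a label.
rng = np.random.default_rng(42)
embedded_df = pd.DataFrame(rng.normal(size=(10, 4)))
embedded_df["nails"] = rng.integers(0, 2, size=10)

# Positional slices stand in for the two frames returned by train_test_split.
train, valid = label_splits(embedded_df.iloc[:8], embedded_df.iloc[8:])
```

In prep_data the same idea would presumably amount to calling .copy() on train and valid right after the split. The parent embedded_df is left without a 'split' column either way.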