I have some code that needs to replace two columns of a pandas DataFrame with the index of each value in the sorted list of that column's unique values. For example:
col1, col2, col3, col4
A, 1, 2, 3
A, 1, 2, 3
B, 1, 2, 3
Should end up in the data frame as:
col1, col2, col3, col4
0, 1, 2, 3
0, 1, 2, 3
1, 1, 2, 3
since A is element 0 in the list of unique col1 values, and B is element number 1.
What I did is:
unique_vals = df['col1'].unique()
# create a map to speed up looking up indexes when we map the dataframe column
unique_vals.sort()
unique_vals_map = {}
for i in range(len(unique_vals)):
    unique_vals_map[unique_vals[i]] = i
df['col1'] = df['col1'].apply(lambda r: unique_vals_map[r])
However that last line gives me:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
I saw other SO answers about this, but I am not sure how to fix it in my particular case. I'm experienced with numpy but new to pandas, so any help is greatly appreciated!
Is there a better way to perform this mapping?
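For reference, here is a minimal sketch of one possible approach, not a definitive fix. It assumes that df was created by filtering a larger frame (the hypothetical raw below), which is a common cause of this warning but is not confirmed by the code above. Taking an explicit .copy() makes the assignment target unambiguous, and pd.factorize with sort=True produces the same codes as the hand-built sorted map.

```python
import pandas as pd

# Hypothetical setup: `raw` and the filter are assumptions for illustration;
# the original post does not show how `df` was created.
raw = pd.DataFrame({
    "col1": ["A", "A", "B"],
    "col2": [1, 1, 1],
    "col3": [2, 2, 2],
    "col4": [3, 3, 3],
})

# An explicit copy means later assignments modify df itself,
# not a possible view of `raw`.
df = raw[raw["col2"] == 1].copy()

# factorize returns (codes, uniques); sort=True sorts the unique values
# first, so each code is the value's position in the sorted unique list.
codes, uniques = pd.factorize(df["col1"], sort=True)
df["col1"] = codes

print(df)
```

The same two lines can be repeated for the second column. The uniques value keeps the original labels, so a code can be translated back with uniques[code] if needed.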