I have two DataFrames, train_df and test_df, and I store them in a list: combine = [train_df, test_df]. Both DFs have a column named "Gender", which is either "male" or "female". Now I want to modify that column in both DFs so that "male" is replaced with 0 and "female" with 1. I used the following code:
for dataset in combine:
    dataset["Gender"] = dataset["Gender"].map({"female": 1, "male": 0})
I noticed that it modified train_df and test_df, as well as both elements of combine. Why is that? I thought that dataset here is a loop variable (so it holds just a local copy of a DF) and nothing would change (compare the question "Apply a for loop to multiple DataFrames in Pandas"). More generally, is it even appropriate to access DF columns in a loop like this when there are multiple DFs? Is there a more Pythonic way?
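For reference, here is a minimal sketch that reproduces what I am seeing. The two small frames are made-up toy data standing in for my real train_df and test_df, and the "Sex" column name in the second loop is purely hypothetical. As far as I can tell, the list and the loop variable refer to the same objects rather than copies, so column assignment changes the originals, while rebinding the loop variable does not:

```python
import pandas as pd

# Hypothetical toy data standing in for the real train_df / test_df.
train_df = pd.DataFrame({"Gender": ["male", "female", "male"]})
test_df = pd.DataFrame({"Gender": ["female", "male"]})
combine = [train_df, test_df]

# The list element and the original name point at the same object.
same_object = combine[0] is train_df

# Column assignment mutates the object that `dataset` refers to,
# so the change is visible through train_df and test_df as well.
for dataset in combine:
    dataset["Gender"] = dataset["Gender"].map({"female": 1, "male": 0})

train_values = train_df["Gender"].tolist()
test_values = test_df["Gender"].tolist()

# Rebinding the loop variable only changes what the name `dataset`
# points to; the original DataFrames keep their "Gender" column.
for dataset in combine:
    dataset = dataset.rename(columns={"Gender": "Sex"})

columns_after_rebind = list(train_df.columns)

print(same_object, train_values, test_values, columns_after_rebind)
```

So the first loop changes both DataFrames, and the second one appears to change nothing outside the loop.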