I feel like there must be a more Pythonic way (i.e., easier and more straightforward) to change column values in the dataframe I am working with. Basically, I am trying to set the values of a 'match' column based on the values of the 'ID' column.
Take this example:
import pandas as pd

data = [['tom', 10, 111], ['nick', 15, 112], ['juli', 14, 113], ['mary', 17, 114]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age', 'ID'])
This gives me a simple dataframe, df.
Now I make several slices of the dataframe:
df2 = df.loc[df['ID'] == 111]
df3 = df.loc[df['ID'] == 112]
df4 = df.loc[df['ID'] == 113]
df5 = df.loc[df['ID'] == 114]
What I want to do is make a new column in my original dataframe (called 'match'). Then I want to compare df2, df3, df4, and df5 to it, based on the ID column. In the 'match' column I will record when those matches occurred. Let me step through my process.
If I do this...
df['match_checker'] = df2['ID'].isin(df['ID'])
df.loc[df['match_checker'] == True, 'match'] = 'Round 1'
df['match_checker'] = df3['ID'].isin(df['ID'])
df.loc[df['match_checker'] == True, 'match'] = 'Round 2'
df['match_checker'] = df4['ID'].isin(df['ID'])
df.loc[df['match_checker'] == True, 'match'] = 'Round 3'
df['match_checker'] = df5['ID'].isin(df['ID'])
df.loc[df['match_checker'] == True, 'match'] = 'Round 4'
The resulting dataframe looks like this, and the 'match' column is the desired outcome (the match_checker column is overwritten on each iteration).
   Name  Age   ID match_checker    match
0   tom   10  111           NaN  Round 1
1  nick   15  112           NaN  Round 2
2  juli   14  113           NaN  Round 3
3  mary   17  114          True  Round 4
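As far as I can tell, the NaN values in match_checker come from index alignment: each slice only carries its own row label, so assigning the result of isin() back to df leaves every other row empty. A small sketch of the last step in isolation, reusing the data above (the `checker` variable is just a name for illustration):

```python
import pandas as pd

data = [['tom', 10, 111], ['nick', 15, 112], ['juli', 14, 113], ['mary', 17, 114]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'ID'])

# The slice keeps only the row whose ID is 114 (row label 3)
df5 = df.loc[df['ID'] == 114]

# isin() returns a Series indexed like df5, i.e. a single row
checker = df5['ID'].isin(df['ID'])

# Assignment aligns on the index, so rows 0-2 have no value and become NaN
df['match_checker'] = checker
```

So the helper column is really acting as a row selector, not as a comparison between two different sets of IDs.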
I have the desired outcome, but creating a subset of the dataframe, then comparing it to the original dataframe seems like a bad way to do it.
Note I'm not looking for the following solution:
df.loc[df['ID'] == 111, 'match'] = 'Round 1'
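What I am hoping for is something closer to a single lookup. As a rough sketch only (the `round_labels` dictionary is a name I made up for illustration, and I am not sure this is the idiomatic approach), mapping the IDs through a dictionary seems to produce the same 'match' column without the slices or the helper column:

```python
import pandas as pd

data = [['tom', 10, 111], ['nick', 15, 112], ['juli', 14, 113], ['mary', 17, 114]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'ID'])

# Hypothetical lookup table: ID -> label to record in the 'match' column
round_labels = {111: 'Round 1', 112: 'Round 2', 113: 'Round 3', 114: 'Round 4'}

# Series.map looks up each ID in the dict; IDs missing from the dict become NaN
df['match'] = df['ID'].map(round_labels)

print(df)
```

This assumes each ID belongs to exactly one round; if the rounds came from separate dataframes, the dictionary would have to be built from them first.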