I need to copy values from a dataframe column into a variable, based on a specific condition, and then delete those values from the dataframe.
The dataframe contains data about some kids, each with a favourite toy and the toy's colour:
import numpy as np
import pandas as pd

data = {'Kid': ['Richard', 'Daphne', 'Andy', 'May', 'Claire', 'Mozart', 'Jane'],
        'Toy': ['Ball', 'Doll', 'Car', 'Barbie', 'Frog', 'Bear', 'Doll'],
        'Colour': ['white', np.nan, 'red', 'pink', 'green', np.nan, np.nan]
        }
df = pd.DataFrame(data, columns=['Kid', 'Toy', 'Colour'])
print(df)
The dataframe looks like this:
       Kid     Toy  Colour
0  Richard    Ball   white
1   Daphne    Doll     NaN
2     Andy     Car     red
3      May  Barbie    pink
4   Claire    Frog   green
5   Mozart    Bear     NaN
6     Jane    Doll     NaN
The condition is: if a kid has a toy but the toy has no colour, then save both the kid and the toy in separate arrays, keeping the order so that the entries still match up:
toy_array = ["Doll", "Bear", "Doll"]
kid_array = ["Daphne", "Mozart", "Jane"]
Then delete the toy from the dataframe for those rows (i.e. set it to NaN). The final dataframe should look like this:
       Kid     Toy  Colour
0  Richard    Ball   white
1   Daphne     NaN     NaN
2     Andy     Car     red
3      May  Barbie    pink
4   Claire    Frog   green
5   Mozart     NaN     NaN
6     Jane     NaN     NaN
I drew on several sources, including this one, and tried the following:
kid_array.append(df.loc[(df['Toy'] != np.nan) & (df['Colour'] == np.nan)])
print(kid_array)
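For what it's worth, one likely reason the attempt above does not behave as intended is that NaN never compares equal to anything, not even itself, so `df['Colour'] == np.nan` is False for every row (and `kid_array` is never defined before `.append` is called). Below is a minimal sketch of how the condition could be written with `isna()` / `notna()` instead. The name `mask` is just an illustrative choice, and this is only one possible approach, not necessarily the best one:

```python
import numpy as np
import pandas as pd

data = {
    'Kid': ['Richard', 'Daphne', 'Andy', 'May', 'Claire', 'Mozart', 'Jane'],
    'Toy': ['Ball', 'Doll', 'Car', 'Barbie', 'Frog', 'Bear', 'Doll'],
    'Colour': ['white', np.nan, 'red', 'pink', 'green', np.nan, np.nan],
}
df = pd.DataFrame(data, columns=['Kid', 'Toy', 'Colour'])

# NaN != NaN, so `== np.nan` is always False and `!= np.nan` is always True.
# isna() / notna() test for missing values instead.
# `mask` (hypothetical name) is True for rows that have a toy but no colour.
mask = df['Toy'].notna() & df['Colour'].isna()

# Copy the matching values out first; both lists follow the row order,
# so kid_array[i] and toy_array[i] belong to the same row.
kid_array = df.loc[mask, 'Kid'].tolist()
toy_array = df.loc[mask, 'Toy'].tolist()

# Only then blank out the toy in those rows.
df.loc[mask, 'Toy'] = np.nan

print(kid_array)
print(toy_array)
print(df)
```

The values are copied before the assignment on purpose: once `Toy` is set to NaN for those rows, the original toy names can no longer be read back from the dataframe.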
I am a complete beginner, so I would greatly appreciate any help!