Say I have a selection of rows from a DataFrame stored in a variable errorData. When I display this variable, the correct rows are shown (i.e. the selection appears to be valid). My goal is to set the Percent value to np.nan only in the rows that match the criteria in my variable.
errorData = df.loc[(df['Percent'] == 100) &\
(df['Rating1'] != 8) &\
(df['Rating2'] != 1)&\
(df['Grade'] == "NG")]
for i in errorData:
    df['Percent'].replace(df['Percent']==100, np.nan, inplace=True)
However, this doesn't appear to be working. Whenever I inspect the Percent column again after performing this operation, values of 100 have also been replaced in rows where
df['Grade'] == "B"
I've tried a couple of other ways too, like:
for i in errorData:
    df['Percent'].replace(100, np.nan, inplace=True)
But again, to no avail. Sorry I haven't posted sample rows here. I've seen that done on other questions but I'm not entirely sure how to format them.
Apologies in advance for any errors in the above.
Update: For further clarification, if I execute
df.loc[(df['Percent'] == 100) &\
(df['Rating1'] != 8) &\
(df['Rating2'] != 1)&\
(df['Grade'] == "NG")].shape
it returns (129, 8), i.e. my valid rows.
And if I perform
df['Percent'].isnull().sum()
before the change, I get 0, but after the change I get 400. This means it is editing more than just the rows selected in my variable errorData, and I cannot see why.
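For reference, here is a minimal sketch of the behaviour I am after, assuming a boolean mask passed to df.loc is a reasonable way to do it. The sample rows are made up for illustration and are not my real data. Two things seem relevant to the attempts above: iterating over a DataFrame with `for i in errorData` yields column labels rather than rows, and `df['Percent'].replace(100, np.nan)` acts on every 100 in the column regardless of the loop variable.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the real frame has 8 columns and many more rows.
df = pd.DataFrame({
    "Percent": [100.0, 100.0, 100.0, 50.0, 100.0],
    "Rating1": [3, 8, 3, 3, 3],
    "Rating2": [2, 2, 1, 2, 2],
    "Grade": ["NG", "NG", "NG", "NG", "B"],
})

# Boolean mask with the same four conditions used to build errorData.
mask = (
    (df["Percent"] == 100)
    & (df["Rating1"] != 8)
    & (df["Rating2"] != 1)
    & (df["Grade"] == "NG")
)

# Assign NaN to Percent only in the rows where the mask is True.
df.loc[mask, "Percent"] = np.nan

# Count the NaN values introduced; this should equal the number of masked rows.
nan_count = int(df["Percent"].isnull().sum())

# Iterating over a DataFrame yields its column labels, not its rows.
column_labels = list(df)
```

With this approach, the row with Grade "B" keeps its Percent of 100, and the NaN count should match the number of rows the mask selects (129 in my real data, if the sketch carries over).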