
I have a pandas DataFrame named new and want to drop every row whose Name column contains any of the values in a list lst.

Original DF:

        Name         
0  Jack Wang 
1  Jack Lee 
2  Mabel Smith
3  Amber Golden

lst = ['Wang', 'Smith']

Desired DF:

        Name   
1  Jack Lee
3  Amber Golden

Below is my code:

for i in range(len(new)):
    if any(elem in lst for elem in new['Name'][i].split()) == True:
        new2 = new.drop([i])

This loop correctly identifies the rows whose Name contains a value from lst, but the rows are not dropped from the result.

Thanks in advance!

MAMS
  • `new.loc[~new['Name'].isin(lst)]`? Please share sample data, as well as the desired output. – help-ukraine-now Aug 04 '19 at 18:06
  • Your issue is that `new2` only reflects changes on the last iteration. Each iteration refers to the original `new`. Dropping on the fly like this changes the shape of the dataframe anyway so is not the best approach. @politicalscientist shows a better approach. – busybear Aug 04 '19 at 18:12
  • Thanks for your input! I edited my question with the original dataset and desired outcome, basically I want to remove the rows that have Name values containing values in lst after split Name values. Could you please suggest? Thanks – MAMS Aug 04 '19 at 18:30
  • @MAMS, thanks for the update. Now I see that `isin` wouldn't work in this case. Instead, try `new.loc[~new['Name'].str.contains('|'.join(lst))]` (the `~` negates the mask, so the matching rows are dropped rather than kept). – help-ukraine-now Aug 04 '19 at 18:46
  • If you want to learn more, check out: [`pandas.Series.str.contains`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html), as well as [this answer](https://stackoverflow.com/a/26577689/10140310) – help-ukraine-now Aug 04 '19 at 18:49
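
For anyone landing here later, the suggestions in the comments can be put together as a rough sketch. It rebuilds the sample data from the question and shows three variants: a negated `str.contains` mask, a whole-word mask that mirrors the `split()` logic of the original loop, and a repaired loop that collects the labels first and drops them in one call. The names `pattern`, `mask`, `new3`, `new4` and `to_drop` are made up for illustration, and the `re.escape` call is an extra precaution not mentioned in the comments (it only matters if `lst` could contain regex special characters).

```python
import re

import pandas as pd

# Sample data rebuilt from the question
new = pd.DataFrame({"Name": ["Jack Wang", "Jack Lee", "Mabel Smith", "Amber Golden"]})
lst = ["Wang", "Smith"]

# Variant 1: substring match via a regex alternation, negated with ~
# so that the matching rows are dropped instead of kept.
pattern = "|".join(re.escape(s) for s in lst)
new2 = new.loc[~new["Name"].str.contains(pattern)]

# Variant 2: whole-word match, closer to the split()-based loop.
# A name such as "Wangari" would be dropped by variant 1 but kept here.
mask = new["Name"].str.split().apply(lambda words: any(w in lst for w in words))
new3 = new.loc[~mask]

# Variant 3: the original loop, repaired. Collect the row labels first,
# then drop them all at once from the original frame.
to_drop = [i for i in new.index if any(w in lst for w in new.at[i, "Name"].split())]
new4 = new.drop(to_drop)

print(new2)
```

All three variants keep the original index labels of the surviving rows; chain `.reset_index(drop=True)` if a fresh 0-based index is wanted.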
