
My DataFrame has several columns:

Id Name age
1 John Camp 25
2 Richard 35
3 John Randell 29
4 John Cameron 26
5 John Jacob 28
6 Oliver 24
7 Eugene 22
8 John Camp 21
9 John Clark 29
10 John Rick 21

Intended Output:

Id Name age
1 John Camp 25
2 Richard 35
4 John Cameron 26
6 Oliver 24
7 Eugene 22
8 John Camp 21
10 John Rick 21

I need to drop the rows whose name starts with John, except for the following three, while keeping all other names:

  1. John Camp
  2. John Cameron
  3. John Rick

In other words, only the Johns who are not among the three names above should be removed; every other row stays.

Can someone help me with this?

The Wolfie
    You could split the names and drop all Johns whose last name is not in ('Rick', 'Cameron', 'Camp'). For the syntax, see https://stackoverflow.com/questions/18172851/deleting-dataframe-row-in-pandas-based-on-column-value – Dschoni Feb 17 '21 at 09:45
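A minimal sketch of the split-based idea from this comment, assuming the first word of Name is the first name and everything after it is the last name. The sample frame is rebuilt from the question, and the names `allowed`, `parts`, and `keep` are illustrative only:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Id": range(1, 11),
    "Name": ["John Camp", "Richard", "John Randell", "John Cameron",
             "John Jacob", "Oliver", "Eugene", "John Camp",
             "John Clark", "John Rick"],
    "age": [25, 35, 29, 26, 28, 24, 22, 21, 29, 21],
})

allowed = ["Camp", "Cameron", "Rick"]

# Split on the first space only: column 0 holds the first name,
# column 1 the remainder (missing for single-word names such as "Richard")
parts = df["Name"].str.split(" ", n=1, expand=True)
first, last = parts[0], parts[1]

# Keep every non-John, plus the Johns whose last name is allowed
keep = (first != "John") | last.isin(allowed)
result = df[keep]
print(result)
```

Comparing the first name for equality, rather than using a prefix test, means a name like "Johnny" would not be treated as a John here; whether that is desirable depends on the real data.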

1 Answer


Use Series.str.endswith with a tuple of surnames, chain it with | (bitwise OR) to the Series.str.startswith mask inverted by ~, and filter with boolean indexing:

sur = ['Camp', 'Cameron', 'Rick']

df = df[df['Name'].str.endswith(tuple(sur)) | ~df['Name'].str.startswith('John')]
print(df)
   Id          Name  age
0   1     John Camp   25
1   2       Richard   35
3   4  John Cameron   26
5   6        Oliver   24
6   7        Eugene   22
7   8     John Camp   21
9  10     John Rick   21
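
For anyone who wants to run the filter end to end, here is a self-contained sketch that rebuilds the question's sample data and applies the same mask. It assumes a pandas version whose Series.str.endswith accepts a tuple. The second mask is a stricter variant that is not part of the answer above: it matches the full names exactly, on the assumption that only "John <surname>" rows should survive. The names `is_john`, `full_names`, and `strict` are illustrative:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Id": range(1, 11),
    "Name": ["John Camp", "Richard", "John Randell", "John Cameron",
             "John Jacob", "Oliver", "Eugene", "John Camp",
             "John Clark", "John Rick"],
    "age": [25, 35, 29, 26, 28, 24, 22, 21, 29, 21],
})

sur = ["Camp", "Cameron", "Rick"]

is_john = df["Name"].str.startswith("John")

# Mask from the answer: the name ends with an allowed surname, OR it is not a John
filtered = df[df["Name"].str.endswith(tuple(sur)) | ~is_john]

# Stricter variant (assumption): keep a John only on an exact full-name match,
# so a hypothetical "John Decamp" would be dropped instead of kept
full_names = ["John " + s for s in sur]
strict = df[df["Name"].isin(full_names) | ~is_john]

print(filtered)
```

On this sample both masks select the same rows. They differ only when a surname merely ends with one of the allowed strings.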
jezrael