
I have a dataframe df1:

site    cell
T96976  V96976A
T96976  V96976B
T96976  V96976C
T96976  V96976O
T96980  D96980A
T96980  D96980B
T96980  U96980C
T97750  D97750N
T97750  D97750A
T97750  D97750B
T97750  V97750O
T97760  V97760A
T97760  V97760B
T97777  L97777A
T97777  U97777B
T97777  V97777C
T99989  G99989P

I want a dataframe df2 such that if any cell ends with 'N', 'O' or 'P', all rows with the same site are deleted.

So my resulting dataframe df2 should look like this:

site    cell
T96980  D96980A
T96980  D96980B
T96980  U96980C
T97760  V97760A
T97760  V97760B
T97777  L97777A
T97777  U97777B
T97777  V97777C
jezrael
amrutha
  • Possible duplicate of [Select rows from a DataFrame based on values in a column in pandas](https://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas) – MattR Mar 21 '18 at 11:55

1 Answer


I think you need:

sites = df.loc[df['cell'].str.contains('[NOP]$'), 'site']
#alternative
#sites = df.loc[df['cell'].str[-1].isin(['N','O','P']), 'site']
df = df[~df['site'].isin(sites)]
print (df)
      site     cell
4   T96980  D96980A
5   T96980  D96980B
6   T96980  U96980C
11  T97760  V97760A
12  T97760  V97760B
13  T97777  L97777A
14  T97777  U97777B
15  T97777  V97777C

Detail:

print (sites)
3     T96976
7     T97750
10    T97750
16    T99989
Name: site, dtype: object

Explanation:

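  1. First get the site of every row whose cell ends with N, O or P by filtering with str.contains ($ is the regex anchor for end of string)
  2. Then filter the original dataframe with isin, inverting the condition with ~ so that only rows whose site is not in that list are kept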
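
For anyone who wants to run the two steps end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question as df1 and writes the result to df2; the names ends_nop, bad and df2_alt are only illustrative. The groupby/transform variant at the end is an alternative I am assuming gives the same result, not something required by the approach above.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "site": ["T96976"] * 4 + ["T96980"] * 3 + ["T97750"] * 4
            + ["T97760"] * 2 + ["T97777"] * 3 + ["T99989"],
    "cell": ["V96976A", "V96976B", "V96976C", "V96976O",
             "D96980A", "D96980B", "U96980C",
             "D97750N", "D97750A", "D97750B", "V97750O",
             "V97760A", "V97760B",
             "L97777A", "U97777B", "V97777C",
             "G99989P"],
})

# Step 1: boolean mask of rows whose cell ends with N, O or P,
# then the sites those rows belong to
ends_nop = df1["cell"].str.contains(r"[NOP]$")
sites = df1.loc[ends_nop, "site"]

# Step 2: keep only rows whose site is NOT among the flagged sites
df2 = df1[~df1["site"].isin(sites)]
print(df2)

# Alternative sketch: flag a whole site if any of its cells matches
bad = ends_nop.groupby(df1["site"]).transform("any")
df2_alt = df1[~bad]
```

Both df2 and df2_alt keep the original index labels; call reset_index(drop=True) if you want them renumbered from 0.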
jezrael