1

Hello, I have a dataframe:

name; id ; firstname ;lastname
MD ALEXIA DORTMINEX ; 1; ALEXIA ; DORTMINEX
DOC PAULO RODRIGEZ ; 3 ; PAOLO ; SANCHEZ

I want to keep only the rows where name contains lastname (i.e. lastname is a substring of name).

In our case, we keep only:

name; id ; firstname ;lastname
MD ALEXIA DORTMINEX ; 1; ALEXIA ; DORTMINEX

because DORTMINEX is in MD ALEXIA DORTMINEX.

Thanks.

alex Maia
  • Does this answer your question? [How do I select rows from a DataFrame based on column values?](https://stackoverflow.com/questions/17071871/how-do-i-select-rows-from-a-dataframe-based-on-column-values) – JonSG Dec 17 '21 at 15:10

3 Answers

0

You can use apply with boolean indexing:

df[df.apply(lambda r: r['lastname'] in r['name'], axis=1)]

Output:

                  name  id firstname   lastname
0  MD ALEXIA DORTMINEX   1    ALEXIA  DORTMINEX
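
For reference, here is a self-contained sketch of the same approach. It assumes the sample values from the question have already been stripped of the spaces around the semicolons; the variable names `mask` and `filtered` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (assumed already stripped of stray spaces)
df = pd.DataFrame({
    "name": ["MD ALEXIA DORTMINEX", "DOC PAULO RODRIGEZ"],
    "id": [1, 3],
    "firstname": ["ALEXIA", "PAOLO"],
    "lastname": ["DORTMINEX", "SANCHEZ"],
})

# Row-wise substring test: True where lastname occurs inside name
mask = df.apply(lambda r: r["lastname"] in r["name"], axis=1)

# Boolean indexing keeps only the rows where the mask is True
filtered = df[mask]
print(filtered)
```

Note that `in` on strings is a plain substring test, so a lastname such as "DORT" would also match "MD ALEXIA DORTMINEX".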
mozway
0

You can check whether each name contains the corresponding lastname using a list comprehension, which returns a list of booleans (True / False). Passing that list to loc filters your dataframe, keeping only the rows where the value is True:

>>> [last in full for last, full in zip(df['lastname'], df['name'])]

[True, False]

>>> df.loc[[last in full for last, full in zip(df['lastname'], df['name'])]]

                  name  id firstname   lastname
0  MD ALEXIA DORTMINEX   1    ALEXIA  DORTMINEX
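
If the data really is stored as semicolon-separated text with stray spaces, as it is displayed in the question, the values need to be stripped while loading, otherwise a lastname like " DORTMINEX" (with a leading space) could fail or match for the wrong reason. The following is only a sketch under that assumption; the `raw` string stands in for the actual file.

```python
import io
import pandas as pd

# Assumption: the data is semicolon-separated text, laid out as in the question
raw = """name; id ; firstname ;lastname
MD ALEXIA DORTMINEX ; 1; ALEXIA ; DORTMINEX
DOC PAULO RODRIGEZ ; 3 ; PAOLO ; SANCHEZ
"""

# A regex separator swallows the whitespace around each semicolon;
# regex separators require the Python parsing engine
df = pd.read_csv(io.StringIO(raw), sep=r"\s*;\s*", engine="python")

# One boolean per row: is lastname a substring of name?
mask = [last in full for last, full in zip(df["lastname"], df["name"])]

# .loc accepts a list of booleans as a row selector
result = df.loc[mask]
print(result)
```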
sophocles
0

You can check, row by row, whether lastname is in name with the apply() function, then filter your dataframe with the resulting boolean mask.

As follows:

mask = df.apply(lambda x: x['lastname'] in x['name'], axis=1)
df = df[mask]

This outputs:

                  name  id firstname   lastname
0  MD ALEXIA DORTMINEX   1    ALEXIA  DORTMINEX
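
One caveat: the lambda raises a TypeError if either column holds a missing value, and the comparison is case-sensitive. Here is a hedged sketch of a more defensive mask. The helper `lastname_in_name`, its `ignore_case` parameter, and the last two sample rows are made up for illustration and are not part of the question.

```python
import pandas as pd

def lastname_in_name(row, ignore_case=True):
    """Hypothetical helper: True when lastname occurs inside name."""
    name, lastname = row["name"], row["lastname"]
    # Treat a missing value in either column as "no match"
    if pd.isna(name) or pd.isna(lastname):
        return False
    name, lastname = str(name), str(lastname)
    if ignore_case:
        name, lastname = name.casefold(), lastname.casefold()
    return lastname in name

# First two rows come from the question; the last two are invented test cases
df = pd.DataFrame({
    "name": ["MD ALEXIA DORTMINEX", "DOC PAULO RODRIGEZ", None, "Dr Ana Lopez"],
    "lastname": ["DORTMINEX", "SANCHEZ", "SMITH", "LOPEZ"],
})

mask = df.apply(lastname_in_name, axis=1)
result = df[mask]
print(result)
```

Whether a case-insensitive match is wanted depends on the data; pass `ignore_case=False` through `df.apply(lastname_in_name, axis=1, ignore_case=False)` to keep the strict comparison.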
Antoine Dubuis