1

I have a DataFrame with 3 columns that represent the same data (the name of a person). The columns are Username, BuyerName, TravelerName. In some cases, these three columns can share the same value (the username is the same person who bought a ticket and used it to travel, for instance), and I want to exclude/filter these cases. Here is my current working code:

def filter_df(df, user_name):
   filtered_df = df.query(f'Username != "{user_name}" & BuyerName != "{user_name}" & TravelerName != "{user_name}")
   return filtered_df

I'd like to know if there is a cleverer way of doing this query instead of just repeating the same value over and over the columns.

Alex Waygood
  • 6,304
  • 3
  • 24
  • 46
heresthebuzz
  • 678
  • 7
  • 21
  • 5
    `df[df[['Username','BuyerName','TravelName']].ne(user_name).all(1)]`. – Quang Hoang Jul 19 '21 at 19:37
  • 1
    [This answer](https://stackoverflow.com/q/15315452/6340496) provides a clean/readable/standard method (from a pandas perspective). Although, it must be said that @QuangHoang comment above is beautiful! – S3DEV Jul 19 '21 at 19:43
  • Thank you! A question that extends the original problem, can this solution be applied as well to the case of filtering rows when `user_name` appears at least once among the three columns? – heresthebuzz Jul 20 '21 at 12:31

2 Answers2

0

You could use advanced string formatting as explained here, so you don't have to repeat the user_name variable name over and over again: Inserting the same value multiple times when formatting a string

jpaodev
  • 136
  • 5
0

You could try using .format():

def filter_df(df, user_name):
   return df.query('Username != "{0}" & BuyerName != "{0}" & TravelerName != "{0}"'.format(user_name))
Jaeden
  • 332
  • 2
  • 4
  • 16