0

enter image description here

Hi Everyone,

I'm using an email verification software to check if emails in my dataset are valid or not. How do I use pandas to match the email in the column named "Valid Email" to the column "Email" and return the value in "Good_Email"? To be clear the result could look like

              Good_Email

              bob_45@gmail.com
              nan
              tom34@gmail.com
              nan

Also lets assume that the column named "Valid Email" will be ordered randomly compared to the column "Email" and can be shorter in length. (i.e. Exactly as presented in the example).

Thanks in advance!

1 Answers1

2

You can use .where() together with .isin(), as follows:

df['Good_Email'] = df['Email'].where(df['Email'].isin(df['Valid Email']))

Demo

data = {'Name': ['Bob', 'Jack', 'Tom', 'Mike'],
 'Email': ['bob_45@gmail.com', np.nan, 'tom34@gmail.com', 'mike@gmail.com'],
 'Valid Email': ['tom34@gmail.com', 'bob_45@gmail.com', np.nan, np.nan]}

df = pd.DataFrame(data)

df['Good_Email'] = df['Email'].where(df['Email'].isin(df['Valid Email']))

Output:

print(df)

   Name             Email       Valid Email        Good_Email
0   Bob  bob_45@gmail.com   tom34@gmail.com  bob_45@gmail.com
1  Jack               NaN  bob_45@gmail.com               NaN
2   Tom   tom34@gmail.com               NaN   tom34@gmail.com
3  Mike    mike@gmail.com               NaN               NaN

SeaBean
  • 22,547
  • 3
  • 13
  • 25