I have two pandas DataFrames, df1 and df2. df1 contains multiple emails per customer, and I want to match it against df2 to see how many customers did a test with the company, by checking whether any of the emails in df1 is also present in df2.
I tried using .str.split(";", expand=True) to split the email ids into separate columns and then pd.merge to join on each of those columns, but that approach is too lengthy. I'm posting here to find a better solution.
df1
myid emails price
1001 pikachu@icloud.com;charizard@gmail.com 1
1002 bulbasaur@gmail.com 2
1003 meowth@gmail.com;james@yahoo.com;jesse@yahoo.com;wobbuffet@yahoo.com 8
1004 abra@gmail.com;ash@yahoo.com 7
1005 squirtle@gmail.com 9
df2
tr_id latest_em test
101 pichu@icloud.com; paul@gmail.com 12
102 ash@yahoo.com 13
103 squirtle@gmail.com 16
104 charmander@gmail.com 18
105 ash@yahoo.com;misty@yahoo.com 10
Expected output:
myid emails price tr_id latest_em test
1004 abra@gmail.com;ash@yahoo.com 7 102 ash@yahoo.com 13
1004 abra@gmail.com;ash@yahoo.com 7 105 ash@yahoo.com;misty@yahoo.com 10
1005 squirtle@gmail.com 9 103 squirtle@gmail.com 16
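For reference, here is a minimal sketch of one possible shorter approach, not necessarily the best one: split each email column into a list, explode it to one row per address, and do a single inner merge on the individual address. The helper name explode_emails and the temporary column email are hypothetical names introduced only for this illustration, and the sketch assumes ";" is the only separator and that stray spaces (as in "; paul@gmail.com") should be ignored.

```python
import pandas as pd

# Sample data copied from the tables above.
df1 = pd.DataFrame({
    "myid": [1001, 1002, 1003, 1004, 1005],
    "emails": [
        "pikachu@icloud.com;charizard@gmail.com",
        "bulbasaur@gmail.com",
        "meowth@gmail.com;james@yahoo.com;jesse@yahoo.com;wobbuffet@yahoo.com",
        "abra@gmail.com;ash@yahoo.com",
        "squirtle@gmail.com",
    ],
    "price": [1, 2, 8, 7, 9],
})
df2 = pd.DataFrame({
    "tr_id": [101, 102, 103, 104, 105],
    "latest_em": [
        "pichu@icloud.com; paul@gmail.com",
        "ash@yahoo.com",
        "squirtle@gmail.com",
        "charmander@gmail.com",
        "ash@yahoo.com;misty@yahoo.com",
    ],
    "test": [12, 13, 16, 18, 10],
})


def explode_emails(df, col):
    """Hypothetical helper: one row per individual address in a temp column 'email'."""
    out = (
        df.assign(email=df[col].str.split(";"))  # list of addresses per row
          .explode("email")                      # one row per address
          .reset_index(drop=True)
    )
    # Assumption: whitespace around an address is not significant.
    out["email"] = out["email"].str.strip()
    return out


left = explode_emails(df1, "emails")
right = explode_emails(df2, "latest_em")

result = (
    left.merge(right, on="email", how="inner")  # match on any shared address
        .drop(columns="email")                  # keep only the original columns
        .drop_duplicates()                      # a pair may share several addresses
        .sort_values(["myid", "tr_id"])
        .reset_index(drop=True)
)
print(result)

# Number of distinct customers in df1 with at least one match in df2.
print(result["myid"].nunique())
```

The original emails and latest_em columns are kept untouched, so the merged rows have the same columns as the expected output; only the temporary email column is used as the join key and then dropped.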