I have a dataframe with two columns: DNI and email.

I have another dataframe with three columns: first name, last name, and num.

This is the data structure:

dataframe 1:

  DNI  email
  1    Name1.lastname1@domain.com
  525  Name2.lastname2@domain.com
  665  Name3.lastname3@domain.com

dataframe 2:

  first name  last name  num
  name2       lastname2  8658685
  name1       lastname1  1131222

I want to add the num column to the first dataframe by matching on the email. If the first name and last name combination does not exist in the email column, I want to use the value 0. The result should look like this:

  DNI  email                       num
  1    Name1.lastname1@domain.com  1131222
  525  Name2.lastname2@domain.com  8658685
  665  Name3.lastname3@domain.com  0

I'm not sure what the correct way to do this is. I was thinking of using for loops and adding values to a dictionary based on some conditionals, but that logic is inefficient with large dataframes.

Any ideas for a better way to do this?

Thanks

Reco Jhonatan

2 Answers

You can follow these steps:

  1. Create a new column "email" in dataframe2 by concatenating the first name, the last name, and "@domain.com".

    dataframe2["email"] = dataframe2["first name"] + "." + dataframe2["last name"] + "@domain.com"

     Make any other string changes your data requires so that this column exactly matches the email format in dataframe1. In the sample data, for example, the emails are capitalized ("Name1") while the names are lowercase ("name1"), so the case has to be normalized before the join.

  2. Left join dataframe1 with dataframe2:

    result = dataframe1.merge(dataframe2, on="email", how="left")

  3. Finally, replace the NaN values in the "num" column with 0:

    result["num"] = result["num"].fillna(0)

The merge also brings in the "first name" and "last name" columns. If you don't need them, drop them from the result, or select only "email" and "num" from dataframe2 before merging.
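Put together, the steps above can be sketched on the sample data from the question. This is only a minimal sketch under two assumptions: the columns are literally named "first name" and "last name", and the match should ignore case (the sample emails are capitalized, the names are not). The helper column `key` and the intermediate names `lookup` and `left` are hypothetical and exist only for the join.

```python
import pandas as pd

# Sample data copied from the question.
dataframe1 = pd.DataFrame({
    "DNI": [1, 525, 665],
    "email": [
        "Name1.lastname1@domain.com",
        "Name2.lastname2@domain.com",
        "Name3.lastname3@domain.com",
    ],
})
dataframe2 = pd.DataFrame({
    "first name": ["name2", "name1"],
    "last name": ["lastname2", "lastname1"],
    "num": [8658685, 1131222],
})

# Step 1: build a lowercase join key on both sides
# (assumption: case should not matter when matching).
built_email = dataframe2["first name"] + "." + dataframe2["last name"] + "@domain.com"
lookup = dataframe2.assign(key=built_email.str.lower())[["key", "num"]]
left = dataframe1.assign(key=dataframe1["email"].str.lower())

# Step 2: a left join keeps every row of dataframe1.
result = left.merge(lookup, on="key", how="left").drop(columns="key")

# Step 3: rows without a match have NaN in "num"; use 0 and restore integers.
result["num"] = result["num"].fillna(0).astype(int)

print(result)
```

Joining on a separate lowercase key leaves the original "email" values untouched. If dataframe2 can contain the same name more than once, the left join will repeat the matching row of dataframe1, so it may be worth deduplicating dataframe2 first.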

segfault404

Extract the first name and last name from the email in df1 and add them as columns.

Then merge df1 with df2 on first name and last name.
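One way to sketch this approach, assuming every email has the form first.last@domain and that df2 stores the names in lowercase as in the sample data. The regular expression and the column names `first` and `last` are illustrative choices, not something given in the question.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "DNI": [1, 525, 665],
    "email": [
        "Name1.lastname1@domain.com",
        "Name2.lastname2@domain.com",
        "Name3.lastname3@domain.com",
    ],
})
df2 = pd.DataFrame({
    "first name": ["name2", "name1"],
    "last name": ["lastname2", "lastname1"],
    "num": [8658685, 1131222],
})

# Extract the two parts before "@" into columns named after the regex groups.
names = df1["email"].str.lower().str.extract(r"^(?P<first>[^.@]+)\.(?P<last>[^@]+)@")

# Merge on the (first, last) pair; a left join keeps every row of df1.
merged = df1.join(names).merge(
    df2,
    left_on=["first", "last"],
    right_on=["first name", "last name"],
    how="left",
)

# Keep only the requested columns and use 0 where no match was found.
out = merged[["DNI", "email", "num"]].copy()
out["num"] = out["num"].fillna(0).astype(int)

print(out)
```

Compared with building the email in df2, this version does not need to know the domain, but it does depend on the email following the assumed pattern.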

RichieV