0

I have two dataframes like

df1

sub_id     Weight
1          56
2          67
3          81
5          73
9          59

df2

sub_id     Text
1          He is normal.
1          person is healthy.
1          has strong immune power.
3          She is over weight.
3          person is small.
9          Looks good.
5          Not well.
5          Need to be tested.

By combining these two data frame i need to get as (when there are multiple sub_id's in second df need to pick first text and combine with first df as below)

merge_df

sub_id   Weight    Text
1        56        He is normal.
2        67        Nan.
3        81        She is over weight.
5        73        Not well.
9        59        Looks good.

Can anyone help me out? Thanks in advance.

cs95
  • 379,657
  • 97
  • 704
  • 746
Rahul
  • 11
  • 3

1 Answers1

0

Here you go:

print(pd.merge(df1, df2.drop_duplicates(subset='sub_id'),
         on='sub_id',
         how='outer'))

Output

   sub_id  Weight                 Text
0       1      56        He is normal.
1       2      67                  NaN
2       3      81  She is over weight.
3       5      73            Not well.
4       9      59          Looks good.

To keep the last duplicate, you'd use the parameter keep='last'

print(pd.merge(df1, df2.drop_duplicates(subset='sub_id', keep='last'),
         on='sub_id',
         how='outer'))

Output

   sub_id  Weight                      Text
0       1      56  has strong immune power.
1       2      67                       NaN
2       3      81          person is small.
3       5      73        Need to be tested.
4       9      59               Looks good.

Balaji Ambresh
  • 4,977
  • 2
  • 5
  • 17
  • Thank you for the answer.... I have other question.. what if i want to pick the last duplicate entry instead of first .... for example for the sub_id 1 i have to get Text as has strong immune power – Rahul Jul 25 '20 at 18:24
  • @RahulThangellapally updated the post to keep the last duplicate – Balaji Ambresh Jul 25 '20 at 18:30