1

Here is what my two dataframes look like:

DF1

NAME   EMAIL                ID
Mark   mark@gmail.com      8974
Sam    sam@gmail.com       9823
June   june@gmail.com      0972
David  david@gmail.com     2143

DF2

ID     ROLE-ID
2143      22
0972      34
8974      98
9823      54

What I need to help doing:

I need to COMPARE the ID column for both dataframes and if the ID from DF1 MATCHES with the ID of DF2, I need to replace the ID column in DF1 with the respective ROLE-ID from DF2.

The output would look like this:

Updated DF1

NAME   EMAIL               ROLE-ID
Mark   mark@gmail.com        98
Sam    sam@gmail.com         54
June   june@gmail.com        34
David  david@gmail.com       22

I am using the Pandas library and tried the merge function with conditions but it didnt work

    print(pd.merge(df1, df2, on=(df1['Id'] == df2[])))
abudda
  • 11
  • 3

3 Answers3

0

Try:

df = df1.merge(df2, on='ID', how='left')
df[['NAME', 'EMAIL', 'ROLE-ID']]

It gives the following:

Screenshot

0

You did not exactly state what should happen if id is not found or is avail multiple times this may not be 100% what you want. It will leave the id untouched then.B ut guess otherwise its what you wanted.

import pandas as pd
import numpy as np

df1 = pd.DataFrame([[1,'a'],
                    [7,'b'],
                    [3,'e'],
                    [2,'c']], columns=['id', 'name'])

df2 = pd.DataFrame([[1,2],
                    [3,8],
                    [2,10]], columns=['id', 'role'])

# collect roles
roles = []
for id in df1.loc[:, 'id']:
    indices = df2.loc[:,'id'] == id
    if np.sum(indices) == 1:
        roles.append(df2.loc[indices, 'role'].iloc[0])
    else:
        # take id if role id is not given
        roles.append(id)  # could also be None if not wanted

# add role id col
df1.loc[:,'role-id'] = roles

# delete old id
del df1['id']

print(df1)

DF1:

   id name
0   1    a
1   7    b
2   3    e
3   2    c

DF2:

   id  role
0   1     2
1   3     8
2   2    10

Output

  name  role-id
0    a        2
1    b        7
2    e        8
3    c       10
Ric Hard
  • 599
  • 4
  • 12
-1

Seems like a merge problem

pd.merge(df2, df1, how='inner')
stahamtan
  • 848
  • 6
  • 10