2

I have 2 dataframes:

df1 = pd.DataFrame(
[
  (73, 15, 'update1', 1, 'foo'),
  (63, 64, 'update2', 2, 'bar'),
  (56, 72, 'update3', 3, 'foo'),
],
columns=['A', 'B', 'C', 'D', 'E'],
)

and

df2 = pd.DataFrame(
[
  (73, 15, 'new1', 2, 'foo'),
  (63, 64, 'new2', 3, 'bar'),
  (56, 72, 'new3', 1, 'foo'),
],
columns=['A', 'B', 'C', 'D', 'E'],
)

I am looking for a way to join these 2 dataframes on column D and then update the column C of df1 so as to match the value of column C in df2.

Final Result:

df1 = pd.DataFrame(
[
  (73, 15, 'new3', 1, 'foo'),
  (63, 64, 'new1', 2, 'bar'),
  (56, 72, 'new2', 3, 'foo'),
],
columns=['A', 'B', 'C', 'D', 'E'],
)

Any help is appreciated.

What I have tried so far?

Run a for loop and update the values in df1 by finding the corresponding value of column c in df2.

ChinwalPrasad
  • 329
  • 2
  • 11

1 Answers1

2

One way is to use map:

df1['C'] = df1.D.map(df2.set_index('D').C)

df1
    A   B     C  D    E
0  73  15  new3  1  foo
1  63  64  new1  2  bar
2  56  72  new2  3  foo

where df2.set_index('D').C returns the mapping you needed from D to C in df2:

df2.set_index('D').C

D
2    new1
3    new2
1    new3
Name: C, dtype: object
Psidom
  • 209,562
  • 33
  • 339
  • 356