0

I have two tables. First table is df_1, down below:

job_function job_area title
General Management Human Resources manager
Learning / Training IT / Computers / Electronics personnel

And second table is df_2, down below:

job_function job_area id title
General Management Human Resources 12312312 manager
Learning / Training IT / Computers / Electronics 12312237 personnel
General Management Human Resources 12312315 manager
Learning / Training IT / Computers / Electronics 12312234 personnel

I want to merge df_2 with df_1, however as you can see there is one to many relationship. So if joining job_function, job_area, title on df_1, in df_2 you will have many ids which will have the same job_function, job_area, title

BigBen
  • 46,229
  • 7
  • 24
  • 40
  • I am not sure if I understand the question. Why would you want to merge both dataframes if all columns of df_1 are already part of df_2? – Ignacio Hernández Nov 09 '22 at 16:52
  • 1
    @IgnacioHernández Sorry if it was not too clear, basically it only looks like this in example, however not all column from df_2 in df_1. And my main goal, is to have id from df_2 in df_1 – Rus Zzzeta Nov 09 '22 at 16:57

1 Answers1

1

I think this is what you want to do (both CSVs I use are identical to what you have in your question):

import pandas as pd

df_1 = pd.read_csv('document1.csv')
df_2 = pd.read_csv('document2.csv')

key_cols = ['job_function', 'job_area', 'title']
merged_df = pd.merge(df_1, df_2,  how='left', left_on=key_cols, right_on=key_cols)

Source: How to join two dataframes on multiple columns