Let's say I have these two dataframes:

worker_name worker_age
Alex 35
John 65
Karl 26

worker_name duties title
Simon Plumber Walmart
Alex Analyst Amazon
John Driver Uber

How can I get such a dataframe?

worker_name worker_age duties title
Alex 35 Analyst Amazon
John 65 Driver Uber
Karl 26 NaN NaN

I tried iterating with df.iterrows(), but it takes too much time, so it's not an option for my data.

JunGi

1 Answer

You can use merge(). Here is an example taken from the pandas documentation:

import pandas as pd

left = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    }
)


right = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    }
)

result = pd.merge(left, right, on="key")
  • Greetings. It did work, but it also removed the keys that were not present in the right dataframe, which is not what I need, unfortunately. P.S. I tried how="left", and it seems to have worked. – JunGi Feb 18 '22 at 19:00
  • Keys that do not appear in both DataFrames are dropped because merge() uses how='inner' by default, so it performs an intersection (it keeps only the keys that appear in both frames). You can change this behaviour by passing your own "how" argument. The linked documentation provides more details on this. – Steven Robyns Feb 18 '22 at 19:05
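
To make the difference between the default inner join and how="left" concrete, here is a minimal sketch using the data from the question. The variable names workers and jobs are made up for illustration, and the key column is assumed to be worker_name in both frames:

```python
import pandas as pd

# Frames rebuilt from the question; "workers" and "jobs" are hypothetical names.
workers = pd.DataFrame(
    {"worker_name": ["Alex", "John", "Karl"], "worker_age": [35, 65, 26]}
)
jobs = pd.DataFrame(
    {
        "worker_name": ["Simon", "Alex", "John"],
        "duties": ["Plumber", "Analyst", "Driver"],
        "title": ["Walmart", "Amazon", "Uber"],
    }
)

# Default how="inner": only names present in both frames survive (Alex, John).
inner = pd.merge(workers, jobs, on="worker_name")

# how="left": every row of the left frame is kept; Karl gets NaN for the
# columns that come from the right frame, and Simon is not included.
left = pd.merge(workers, jobs, on="worker_name", how="left")

print(left)
```

Because the join is done in a single vectorized call, this should also avoid the slowdown seen with df.iterrows().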