How to add multiple columns to an existing dataframe?

Question

Let's say I have these two dataframes:

worker_name	worker_age
Alex	35
John	65
Karl	26

worker_name	duties	title
Simon	Plumber	Walmart
Alex	Analyst	Amazon
John	Driver	Uber

How can I get such a dataframe?

worker_name	worker_age	duties	title
Alex	35	Analyst	Amazon
John	65	Driver	Uber
Karl	26	NaN	NaN

I tried iterating over the rows with df.iterrows(), but it is too slow to be an option for my data.

score 1 · Answer 1 · answered Feb 18 '22 at 18:53


You can use merge(). Example taken from documentation:

import pandas as pd

left = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    }
)

right = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    }
)

# Join the two frames on the shared "key" column (default how="inner")
result = pd.merge(left, right, on="key")

answered Feb 18 '22 at 18:53

Steven Robyns


Greetings. It worked, but it also dropped the keys that are not present in the right dataframe, which is not what I need, unfortunately. P.S. I tried how="left" and it seems to work. – JunGi Feb 18 '22 at 19:00
Keys that do not appear in both DataFrames are dropped because merge() uses how='inner' by default, so it performs an intersection (only keys that appear in both frames are kept). You can change this behaviour by passing your own "how" argument. The linked documentation provides more details on this. – Steven Robyns Feb 18 '22 at 19:05
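
For the frames in the question, a left merge could look roughly like the sketch below. The variable names workers and jobs are made up for illustration, and the data is copied from the question. With how="left", every row of the left frame is kept, and workers without a match in the right frame get NaN in the new columns.

```python
import pandas as pd

# Hypothetical names; data copied from the question.
workers = pd.DataFrame(
    {
        "worker_name": ["Alex", "John", "Karl"],
        "worker_age": [35, 65, 26],
    }
)

jobs = pd.DataFrame(
    {
        "worker_name": ["Simon", "Alex", "John"],
        "duties": ["Plumber", "Analyst", "Driver"],
        "title": ["Walmart", "Amazon", "Uber"],
    }
)

# how="left" keeps all rows from workers; Karl has no match in jobs,
# so his duties and title are filled with NaN. Simon only exists in
# jobs and is therefore not part of the result.
result = workers.merge(jobs, on="worker_name", how="left")
print(result)
```

If rows that exist only in the right frame should be kept as well, how="outer" would be the option to try instead.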
