Insert columns values in dataframe based on columns of other dataframe

Question

Let's say I originally have a dataframe df:

	name	age
0	Robert	21
1	Bob	22
2	Mike	23

And there's also a dataframe df2:

	name	attribute	value
0	Robert	email	robert@email.com
1	Bob	email	bob@email.com
2	Robert	married?	Yes
3	Mike	married?	No
4	Mike	employed?	Yes
5	Robert	have a car?	Yes
6	Bob	have a car?	No

Beforehand, I already know that a person will have the attributes, for example, "age", "email", "married?", "employed?", "have a car?", "imigrant?", "student?".

How can I get a df_final like this:

	name	age	email	married?	employed?	have a car?
0	Robert	21	robert@email.com	Yes		Yes
1	Bob	22	bob@email.com			No
2	Mike	23		No	Yes

I thought about creating the attributes columns by hand in df and then use df.loc[df['name']==df2['name'], df2['attribute']] to find find the correct indexes to insert the values, but I couldn't get it right.

`df.pivot(index='name', columns='attribute', values='value')` then merge the frames together. — It_is_Chris, Mar 16 '23 at 15:07
Does this answer your question? [How can I pivot a dataframe?](https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe) — It_is_Chris, Mar 16 '23 at 15:07

James · Accepted Answer · 2023-03-17T11:37:18.173

1

You need to pivot the second data frame, then merge the result to the first. The merge is performed by matching the 'name' column from df to the index (also names) of the pivoted data.

So we need to pass left_on='name' to use the df.name column, and right_index=True to use the index from the pivot data.

To ensure the additional columns are present, we can add them after the merge.

df3 = df.merge(
   df2.pivot(index='name', columns='attribute', values='value'),
   how='left',
   left_on='name',
   right_index=True,
)

for col in ['email', 'married?', 'employed?', 'have a car?', 'imigrant?', 'student?']:
    if not col in df3:
        df3[col] = ''

You will end up with some NA values, which you can leave or fill with .fillna.

edited Mar 17 '23 at 11:37

answered Mar 16 '23 at 15:10

James

32,991
4
47
70

What about the attribute columns that didnt't appear on `df2` ("imigrant?", "student?") ? I'm asking this because I already know which columns needs to be on `df_final`, but I also know that not all of them will appear on `df2`. – NicholasHenrique Mar 16 '23 at 17:09
1

added an update for the missing columns – James Mar 17 '23 at 11:37

Insert columns values in dataframe based on columns of other dataframe

1 Answers1