Append Columns to Dataframe 1 Based on Matching Column Values in Dataframe 2

Question

I have two dataframes, df1 and df2. The first of these dataframes tracks the locations (i.e., ZIP codes) of specific individuals at different time points:

ID	ZIP 1	ZIP 2	ZIP 3
1	55333	N/A	55316
2	55114	27265	27265
3	55744	55744	N/A

The second dataframe contains several columns of data pertaining to every ZIP code in the country (many of which do not appear in df1):

ZIP	State	Tier
01001	MA	1
...	...	...
27265	NC	2
55114	MN	4
55316	MN	7
55333	MN	5
55744	MN	3

I would like to merge these dataframes and append the variable names from df2 to the ends of the corresponding ZIP/time point variable in df1 to end up with something like this (Note: I removed the ZIP 3 variable for simplicity; I'd still like to append the relevant State and Tier data, as shown for ZIP 1 and ZIP 2):

ID	ZIP 1	ZIP 2	ZIP 1 State	ZIP 2 State	ZIP 1 Tier	ZIP 2 Tier
1	55333	N/A	MN	N/A	5	N/A
2	55114	27265	MN	NC	4	2
3	55744	55744	MN	MN	3	3

The closest solution I have come up with is to create multiple "merged" dataframes by merging on each individual ZIP code variable in df1. This is obviously less than ideal, and does not resolve the variable naming issue either.

merged = pd.merge(df1, df2, left_on = 'ZIP 1', right_on = 'ZIP', how = 'left')
merged2 = pd.merge(df1, df2, left_on = 'ZIP 2', right_on = 'ZIP', how = 'left')
merged3 = pd.merge(df1, df2, left_on = 'ZIP 3', right_on = 'ZIP', how = 'left')

Any guidance would be much appreciated! :-)

I think this could be done with joins if you are familiar with them — wjmccann, Apr 18 '22 at 16:55

score 0 · Answer 1 · answered Apr 18 '22 at 16:50

Try something like this:

dfs = df1.set_index('ID').stack().rename('ZIP').reset_index().drop('level_1', axis=1)
dfm = dfs.merge(df2)
df_out =dfm.set_index(['ID', dfm.groupby('ID').cumcount() +1]).unstack()
df_out.columns = [f'{i} {j}' for i, j in df_out.columns]
print(df_out)

Output:

    ZIP 1  ZIP 2  ZIP 3 State 1 State 2 State 3 Tier 1 Tier 2 Tier 3
ID                                                                  
1   55333  55316    NaN      MN      MN     NaN      5      7    NaN
2   55114  27265  27265      MN      NC      NC      4      2      2
3   55744  55744    NaN      MN      MN     NaN      3      3    NaN

Append Columns to Dataframe 1 Based on Matching Column Values in Dataframe 2

1 Answers1

Linked