Merge two tables or make a new one

Question

How can I get df3, that is a multiplication of the values in df1 by the coefficients in df2 matching on the type and year.

df1:

   year  type_1  type_2  type_3  pepse
0  2011       1       1       1      1
1  2011       2       2       2      1
2  2011       3       3       3      1
3  2011       4       4       4      1

df2:

   year combo_type   X  Y
0  2011     type_1   1  5
1  2011     type_3  11  6

output df3:

   year  type_1  type_2  type_3  pepse  X_type_1  X_type_2  X_type_3  Y_type_1  Y_type_2  Y_type_3
0  2011       1       1       1      1         1         0        11         5         0         6
1  2011       2       2       2      1         1         0        11         5         0         6
2  2011       3       3       3      1         1         0        11         5         0         6
3  2011       4       4       4      1         1         0        11         5         0         6

dataframes:

df1 = pd.DataFrame({'year':[2011,2011,2011,2011],'type_1':[1,2,3,4],'type_2':[1,2,3,4],'type_3':[1,2,3,4],'pepse':[1,1,1,1]})
df2 = pd.DataFrame({'year':[2011,2011],'combo_type':['type_1','type_3',],'X':[1,11],'Y':[5,6,]})
df3 = pd.DataFrame({'year':[2011,2011,2011,2011],'type_1':[1,2,3,4],'type_2':[1,2,3,4],'type_3':[1,2,3,4],'pepse':[1,1,1,1],'X_type_1':[1,1,1,1],'X_type_2':[0,0,0,0],'X_type_3':[11,11,11,11],'Y_type_1':[5,5,5,5],'Y_type_2':[0,0,0,0],'Y_type_3':[6,6,6,6]})

enter image description here

Is your output exhaustive? What happens to the other factory names? — mozway, Jan 26 '22 at 12:33
Please read [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101). What are you trying to do, exactly? Please explain in words what is supposed to happen with type_1, type_2, type_3 ; I think you want to do `pd.wide_to_long()` or `pivot()` on the field `df2.combo_type`. Then do the merge. — smci, Jan 26 '22 at 14:31
@mozway, no, this is just a slice, df1 and df2 have 2011-2021. — Astraz, Jan 26 '22 at 14:41
@Astraz can you double check the output I think the one you provided is incorrect. Also, please add a bit a description of what the operations are for clarity to everyone — mozway, Jan 26 '22 at 14:58
@mozway,I added a new picture, maybe it will clarify something for you, sorry for my English — Astraz, Jan 26 '22 at 15:26

mozway · Accepted Answer · 2022-01-26T16:35:01.033

I think there might be a mistake in your output. As you have a duplicated index you first need to add a group helper column to deduplicate otherwise you'll aggregate the data in the pivot steps.

I would do:

cols = ['X', 'Y']

# deduplicate df1 index
# by adding a secondary key where the year is the same
df1 = df1.assign(group=lambda d: d.groupby('year').cumcount())

idx = ['year', 'combo_type']

out = (    # reshape df1 to have one VALUE per row (wide to long)
 df2.merge(df1.melt(id_vars=['year', 'group'], var_name='combo_type'), 
           on=idx)  # merge df2 and reshaped df1 on year/type
    .set_index(idx+['group'])  # set year/group aside 
     # remove the VALUE and use it to multiply the X/Y columns
    .pipe(lambda d: d[cols].mul(d.pop('value'), axis=0))
    .reset_index() # put back index as columns
     # reshape back to wide format
    .pivot_table(index=['year', 'group'], columns='combo_type', fill_value=0)
)

# merge the multiindex levels
out.columns = out.columns.map('_'.join)

# join new columns to original dataframe
df3 = (df1.merge(out, left_on=['year', 'group'], right_index=True)
          .drop(columns='group')
       )

output:

   year  type_1  type_2  type_3  pepse  X_type_1  X_type_3  Y_type_1  Y_type_3
0  2011       1       1       1      1         1        11         5         6
1  2011       2       2       2      1         2        22        10        12
2  2011       3       3       3      1         3        33        15        18
3  2011       4       4       4      1         4        44        20        24

@smci yes, it's a lot of reshaping a combining operations ;) — mozway, Jan 26 '22 at 14:52
I have to say the previous version of the question had more details in it — mozway, Jan 26 '22 at 14:55
Can you add some explanation to the answer, for the key steps? — smci, Jan 26 '22 at 16:23
@smci I documented the code. I suggest you comment (`#`) all the lines in the pipeline and the `out.columns = out.columns.map('_'.join)` line and run the code checking the value of the output `out`. Then uncomment each line one by one to see the steps ;) — mozway, Jan 26 '22 at 16:36

Merge two tables or make a new one

1 Answers1