2

I am trying to map using columns in df1 and df2 but still cannot figure out to find the desired output.
How do I map or assign by using Id and Prod columns in df1, or else there is another way to do so?

import pandas as pd

df1 = pd.DataFrame({'Name': ['Mike', 'John','Paul' ,'Paul','Paul', 'Mike','John','Mike' ],
                    'Id': ['111', '111', '111','111', '222', '222', '222', '222'],
                    'Prod': ['Shoes', 'Clothes','Shoes', 'Clothes', 'Clothes', 'Shoes','Clothes', 'Shoes' ],
                    'Quarter': ['2016Q2', '2016Q2','2016Q2','2017Q1','2017Q2','2017Q2','2017Q3', '2017Q4']})
df2 = pd.DataFrame({'Id': ['111','111','111','111','111','111','111','111','222','222','222','222','222','222','222','222'],
                   'Quarter': ['2016Q1', '2016Q2', '2016Q3', '2016Q4','2017Q1', '2017Q2', '2017Q3', '2017Q4','2016Q1', '2016Q2', '2016Q3', '2016Q4','2017Q1', '2017Q2', '2017Q3', '2017Q4'],
                   'Shoes': ['Nan', '100','200','150','300','400','140', 'Nan','240', '320','200','150','333','800','470', 'Nan'],
                   'Clothes' : ['200','110','300','50','180','90','0','70','Nan','50','180','90','Nan','90','200','110']})

df1

Name    Id  Prod    Quarter
Mike    111 Shoes   2016Q2
John    111 Clothes 2016Q2
Paul    111 Shoes   2016Q2
Paul    111 Clothes 2017Q1
Paul    222 Clothes 2017Q2
Mike    222 Shoes   2017Q2
John    222 Clothes 2017Q3
Mike    222 Shoes   2017Q4
df2

Id  Quarter Shoes   Clothes
111 2016Q1  Nan     200
111 2016Q2  100     110
111 2016Q3  200     300
111 2016Q4  150     50
111 2017Q1  300     180
111 2017Q2  400     90
111 2017Q3  140     0
111 2017Q4  Nan     70
222 2016Q1  240     Nan
222 2016Q2  320     50
222 2016Q3  200     180
222 2016Q4  150     90
222 2017Q1  333     Nan
222 2017Q2  800     90
222 2017Q3  470     200
222 2017Q4  Nan     110

Expected output:

Name Id Prod Quarter Shoes Clothes
Mike 111 Shoes 2016Q2 100 Nan
John 111 Clothes 2016Q2 Nan 110
Paul 111 Shoes 2016Q2 100 Nan
Paul 111 Clothes 2017Q1 Nan 180
Paul 222 Clothes 2017Q2 Nan 90
Mike 222 Shoes 2017Q2 800 Nan
John 222 Clothes 2017Q3 Nan 200
Mike 222 Shoes 2017Q4 Nan 110
Jason
  • 41
  • 4

1 Answers1

0

First, use df1.merge(df2) and then change Shoes and Clothes values.

df3 = df1.merge(df2)

df3['Clothes'] = ['Nan' if df1['Prod'].iloc[i]=='Shoes' else df3['Clothes'].iloc[i] for i in range(8)]
df3['Shoes'] = ['Nan' if df1['Prod'].iloc[i]=='Clothes' else df3['Shoes'].iloc[i] for i in range(8)]

output:

   Name   Id     Prod Quarter Shoes Clothes
0  Mike  111    Shoes  2016Q2   100     Nan
1  John  111  Clothes  2016Q2   Nan     110
2  Paul  111    Shoes  2016Q2   100     Nan
3  Paul  111  Clothes  2017Q1   Nan     180
4  Paul  222  Clothes  2017Q2   Nan      90
5  Mike  222    Shoes  2017Q2   800     Nan
6  John  222  Clothes  2017Q3   Nan     200
7  Mike  222    Shoes  2017Q4   Nan     Nan
Qavi
  • 36
  • 3