2

I have two Dataframes, G1 and G2:

G1:

studentId midterm homework final
101       45       69       89  
102       20       22       11
103       11       11       11 

G2:

studentId midterm homework final
101       55       69       89  
102       20       65       11
103       11       11       11

Trying to get the output as below:

student           Old  New
101     Midterm    45   55
102     Homework   22   65   
anky
  • 74,114
  • 11
  • 41
  • 70
  • Am not able to add below code in main question, sorry for that: code i tried: code: `GBool = (G1!= G2).stack() Gdiff = pd.concat([G1.stack()[GBool],G2.stack() [GBool]], axis=1) Gdiff.columns=["Old", "New"] print(Gdiff) but am not able to get the student id in output – Geetha Anand Dec 14 '19 at 07:00
  • @ jezrael, i referred the link you suggested, but not solved my problem – Geetha Anand Dec 14 '19 at 07:02
  • What is error here? – jezrael Dec 14 '19 at 07:04
  • Am able to fetch the differences, but i want to print the first column(studentId in this case) it is giving error: Tried with below code: new_Gdiff = pd.concat([G1db.iloc[0:,0],diff[],axis=1) getting error as : AttributeError: 'list' object has no attribute 'columns'. Need help on this. – Geetha Anand Dec 14 '19 at 07:07

2 Answers2

2

Create index by columns studentId for both DataFrames, also is possible add parameter keys to concat for new columns names:

G1 = G1.set_index('studentId')
G2 = G2.set_index('studentId')
GBool = (G1!= G2).stack() 
Gdiff = pd.concat([G1.stack()[GBool],G2.stack() [GBool]], axis=1, keys=('Old','New'))
print(Gdiff) 
                    Old  New
studentId                   
101       midterm    45   55
102       homework   22   65

If is necessary convert MultiIndex to columns:

Gdiff = (pd.concat([G1.stack()[GBool],
                   G2.stack()[GBool]], 
                   axis=1, 
                   keys=('Old','New')).rename_axis(('studentId','type')).reset_index())
print(Gdiff) 
   studentId      type  Old  New
0        101   midterm   45   55
1        102  homework   22   65
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

Here is an alternate way using merge and pd.wide_to_long:

m=df1.merge(df2,on='studentId',suffixes=('_old','_new'))
n=(pd.wide_to_long(m,df1.columns[1:],'studentId','type',sep='_',suffix='\w+')
                                                         .unstack().stack(0))
n.query("new!=old")

           type    new old
studentId           
101       midterm   55  45
102       homework  65  22
anky
  • 74,114
  • 11
  • 41
  • 70