I am having trouble understanding how a pandas MultiIndex works. Specifically:
- how to merge two dataframes with different numbers of index levels (by row)
- how to change the index levels of a dataframe
Using an example from a previous question:
import numpy as np
import pandas as pd

d1 = pd.DataFrame({'StudentID': ["x1", "x10", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9"],
                   'StudentGender': ['F', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'M'],
                   'ExamenYear': ['2007', '2007', '2007', '2008', '2008', '2008', '2008', '2009', '2009', '2009'],
                   'Exam': ['algebra', 'stats', 'bio', 'algebra', 'algebra', 'stats', 'stats', 'algebra', 'bio', 'bio'],
                   'Participated': ['no', 'yes', 'yes', 'yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes'],
                   'Passed': ['no', 'yes', 'yes', 'yes', 'no', 'yes', 'yes', 'yes', 'no', 'yes']},
                  columns=['StudentID', 'StudentGender', 'ExamenYear', 'Exam', 'Participated', 'Passed'])
I compute two summary tables, one per year and exam and one per year only:
def ZahlOccurence_0(x):
    # Count all rows, participants, and passes in one group.
    return pd.Series({'All': len(x['StudentID']),
                      'Part': sum(x['Participated'] == 'yes'),
                      'Pass': sum(x['Passed'] == 'yes')})
t1 = d1.groupby(['ExamenYear', 'Exam']).apply(ZahlOccurence_0)
t2 = d1.groupby('ExamenYear').apply(ZahlOccurence_0)
How can I merge t1 and t2 by rows?
print(t1)
                    All  Part  Pass
ExamenYear Exam
2007       algebra    1     0     0
           bio        1     1     1
           stats      1     1     1
2008       algebra    2     1     1
           stats      2     2     2
2009       algebra    1     1     1
           bio        2     2     1
print(t2)
            All  Part  Pass
ExamenYear
2007          3     2     2
2008          4     3     3
2009          3     3     2
I tried the following, to give t2 a second index level:
t2 = t2.set_index([t2.index, np.array(['tot']* 3)], append = False)
but
pd.concat(t1,t2)
produces an error:
ValueError: Cannot call bool() on DataFrame.
What am I doing wrong?
Thanks in advance
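
For reference, here is a minimal sketch of one possible way to stack the two tables by rows. It rests on two assumptions: that `pd.concat` expects its frames as a single list in the first argument (so in `pd.concat(t1,t2)` the second frame is likely being read as the `axis` argument), and that a second index level can be added through `set_index(..., append=True)`. The small `t1` and `t2` frames below are hand-built stand-ins, not the real output of the groupby above, and `t2_tot` and `merged` are names made up for this illustration.

```python
import pandas as pd

# Stand-in for t1: two index levels (ExamenYear, Exam).
t1 = pd.DataFrame(
    {"All": [1, 1, 2], "Part": [0, 1, 1], "Pass": [0, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [("2007", "algebra"), ("2007", "bio"), ("2008", "algebra")],
        names=["ExamenYear", "Exam"],
    ),
)

# Stand-in for t2: a single index level (ExamenYear).
t2 = pd.DataFrame(
    {"All": [2, 2], "Part": [1, 1], "Pass": [1, 1]},
    index=pd.Index(["2007", "2008"], name="ExamenYear"),
)

# Add a constant 'Exam' column and append it to the existing index,
# so that t2_tot has the same two index levels as t1.
t2_tot = t2.copy()
t2_tot["Exam"] = "tot"
t2_tot = t2_tot.set_index("Exam", append=True)

# Pass the frames as ONE list; sorting the index then places each
# 'tot' row after the exams of the same year.
merged = pd.concat([t1, t2_tot]).sort_index()
print(merged)
```

Going the other way, `reset_index(level='Exam')` would turn an index level back into an ordinary column, which is one way to change the index depth of a dataframe.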