I'm new to pandas and would like to analyse some data arranged like this:
label  aa                   bb
index
0      [2, 5, 1, 4]         [x1, x2, y1, z1]
1      [3, 3, 19]           [x3, x4, y2]
2      [6, 4, 2, 8, 9, 10]  [y1, y2, z3, z4, x1, w]
where x1, x2, x3, x4 are of type M; y1, y2 are of type N; and z1, z2, z3, z4 are of type O. Each value in aa belongs to the id at the same position in bb. Note that the last element of data.loc[2, 'bb'] is w, which does not belong to any type. This relationship is defined in MongoDB as follows:
{'_id' : ObjectId(x1), type : 'M'}
{'_id' : ObjectId(y1), type : 'N'}
{'_id' : ObjectId(z1), type : 'O'}...
db.data.find({'_id' : ObjectId(w)}) returns no documents
The desired output would be like this:
label  sum_M  sum_N  sum_O
index
0      7      1      4
1      6      19     0
2      9      10     10
Does anyone know how to do this with pandas?
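One possible approach, as a minimal sketch rather than a definitive answer. It assumes the id-to-type relationship has already been read from MongoDB into a plain dict (the name type_of and the string ids below are hypothetical stand-ins for the real ObjectId values), and that pandas 1.3 or later is installed, since exploding several columns at once needs that version. The idea is to turn each (aa, bb) pair into its own row, attach the type, and then sum per original row and type.

```python
import pandas as pd

# Sample frame from the question; ids are plain strings here for illustration.
df = pd.DataFrame({
    "aa": [[2, 5, 1, 4], [3, 3, 19], [6, 4, 2, 8, 9, 10]],
    "bb": [["x1", "x2", "y1", "z1"],
           ["x3", "x4", "y2"],
           ["y1", "y2", "z3", "z4", "x1", "w"]],
})
df.index.name = "index"

# Hypothetical lookup, assumed to have been loaded from MongoDB beforehand.
# "w" is deliberately absent because it has no type.
type_of = {
    "x1": "M", "x2": "M", "x3": "M", "x4": "M",
    "y1": "N", "y2": "N",
    "z1": "O", "z2": "O", "z3": "O", "z4": "O",
}

# One row per (aa, bb) pair; the original row label ends up in column "index".
long = df.explode(["aa", "bb"]).reset_index()
long["aa"] = long["aa"].astype("int64")   # explode leaves object dtype
long["type"] = long["bb"].map(type_of)    # NaN for ids without a type

sums = (
    long.dropna(subset=["type"])          # drop untyped ids such as "w"
        .groupby(["index", "type"])["aa"].sum()
        .unstack(fill_value=0)            # types become columns, missing -> 0
        .add_prefix("sum_")
        .reindex(df.index, fill_value=0)  # keep rows that had no typed ids
)
print(sums)
```

For the sample data this gives the sum_M, sum_N and sum_O columns from the desired output, and the value paired with w is left out because w has no entry in the lookup.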