Flatten DataFrame by group with columns creation in Pandas

Question

I have the following pandas DataFrame


   Id_household  Age_Father  Age_child
0             1          30          2
1             1          30          4
2             1          30          4
3             1          30          1
4             2          27          4
5             3          40         14
6             3          40         18

and I want to achieve the following result

              Age_Father  Age_child_1  Age_child_2  Age_child_3  Age_child_4
Id_household                                                                
1                     30            1          2.0          4.0          4.0
2                     27            4          NaN          NaN          NaN
3                     40           14         18.0          NaN          NaN

I tried stacking with multi-index renaming, but I am not very happy with it and I am not able to make everything work properly.

@YOBEN_S, I do not agree with your duplicate tag! I have been lucky to have Scott Boston to answer to my question perfectly. I looked to the question you associated my question with, and I have not been able to find my answer in https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe I do not think that my question is a "real" duplicate in the sense "of cut-and-pastes, accidental duplicates, and borderline duplicates" https://meta.stackexchange.com/questions/10841/how-should-duplicate-questions-be-handled . Thank you for the link, sorry for not agreeing with your decision. — Adrien Pacifico, Jun 09 '20 at 16:08
Check Q10 and Q11 , that question cover almost all situation for pivot — BENY, Jun 09 '20 at 16:10
I like the answer you linked to, and I do agree that Q10 an Q11 provide most of what is needed to answer to my specific case. But to my opinion, it does not. With my cognitive ability and knowledge I would not have been able to create @Scott Boston answer... And I have spent time searching for a solution to my problem both on SO and the official doc... — Adrien Pacifico, Jun 09 '20 at 16:23

score 2 · Accepted Answer · answered Jun 09 '20 at 15:23

Use this:

df_out = df.set_index([df.groupby('Id_household').cumcount()+1, 
                       'Id_household', 
                       'Age_Father']).unstack(0)
df_out.columns = [f'{i}_{j}' for i, j in df_out.columns]
df_out.reset_index()

Output:

   Id_household  Age_Father  Age_child_1  Age_child_2  Age_child_3  Age_child_4
0             1          30          2.0          4.0          4.0          1.0
1             2          27          4.0          NaN          NaN          NaN
2             3          40         14.0         18.0          NaN          NaN

Flatten DataFrame by group with columns creation in Pandas

1 Answers1