This is similar to the problem I asked here. However, I found out that the data I am working is not always consistent. For, example say :
import pandas as pd
df = pd.DataFrame(pd.DataFrame([[1,2,3,4],[5,6,7,8],[9,10,11,12]],columns=["X_a","Y_c","X_b","Y_a"]))
X_a Y_c X_b Y_a
0 1 2 3 4
1 5 6 7 8
2 9 10 11 12
Now you can see that X
does not have corresponding c
column and Y
does not have corresponding b
column. Now when I want to create the multi-level index, I want the dataframe to look like this:
X Y
a b c a b c
0 1 3 -1 4 -1 2
1 5 7 -1 8 -1 6
2 9 11 -1 12 -1 10
So as you can see, I want the split in such a way that all upper level columns should have the same lower level columns. Since, the dataset is positve, I am thinking of filling the missing columns with -1, although I am open for suggestions on this. The closest thing I found to my problem was this answer. However, I cannot make it to somehow work with MultiLevel Index like in my previous question. Any help is appreciated.