I have a DataFrame of model results. The index holds the independent variable names and model statistics, and each model has two columns: one for coefficients and one for p-values. I would like to order the columns by model accuracy (the value in the "f-stat" row of each model's "Coefs" column) while keeping each coefficient column immediately followed by its p-value column.
For example:
| | (Model 1, Coefs) | (Model 1, Prob) | ... | (Model 3, Coefs) | (Model 3, Prob) |
|---|---|---|---|---|---|
| Const | -0.00110 | 9.04e-01 | ... | 0.030600 | 0.000000 |
| Var 1 | 0.124000 | 0.00e+00 | ... | 0.683400 | 0.021200 |
| Var 2 | NaN | NaN | ... | 0.337700 | 0.065100 |
| f-stat | 27.76398 | 9.46e-10 | ... | 6.794688 | 0.000405 |
Supposing the models ranked by accuracy are Model 2, Model 1, Model 3, I would like a way to generate the flat list [('Model 2','Coefs'),('Model 2','Prob'),('Model 1','Coefs'),('Model 1','Prob'),('Model 3','Coefs'),('Model 3','Prob')] so that I can reorder the columns of the table.
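To make the goal concrete, here is a minimal sketch of how such a flat list would be used once it exists. The frame `df` is a hypothetical stand-in for `dictOLS[Y]`, and the Model 2 values are made up, since the table above elides them.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns mirroring the table above.
columns = pd.MultiIndex.from_product(
    [["Model 1", "Model 2", "Model 3"], ["Coefs", "Prob"]]
)

# Model 1 and Model 3 values come from the table; Model 2 values are invented.
df = pd.DataFrame(
    [
        [-0.0011, 0.904, 0.02, 0.10, 0.0306, 0.0],
        [0.124, 0.0, 0.5, 0.01, 0.6834, 0.0212],
        [np.nan, np.nan, 0.3, 0.04, 0.3377, 0.0651],
        [27.76398, 9.46e-10, 40.0, 1e-12, 6.794688, 0.000405],
    ],
    index=["Const", "Var 1", "Var 2", "f-stat"],
    columns=columns,
)

# The flat list of (model, statistic) tuples described above.
order = [
    ("Model 2", "Coefs"), ("Model 2", "Prob"),
    ("Model 1", "Coefs"), ("Model 1", "Prob"),
    ("Model 3", "Coefs"), ("Model 3", "Prob"),
]

# Selecting with a list of tuples returns the columns in that order.
reordered = df[order]
print(reordered.columns.tolist())
```

The list has to be flat for this to work: each element must itself be a column key, not a list of keys.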
I've tried using
[[(x[0],'Coefs'),(x[0],'Prob')] for x in dictOLS[Y][[x for x in dictOLS[Y].columns if x[1]=='Coefs']].loc['f-stat'].sort_values(ascending=False).index]
And
[*[(x[0],'Coefs'),(x[0],'Prob')] for x in dictOLS[Y][[x for x in dictOLS[Y].columns if x[1]=='Coefs']].loc['f-stat'].sort_values(ascending=False).index]
But the first one returns a list of lists, and the second raises an error.
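The difference between the nested and the flat result can be sketched without pandas. Here `ranked_models` is a hypothetical stand-in for the model names taken from the sorted "f-stat" index (the `x[0]` values in the expressions above). A second `for` clause in the comprehension, or `itertools.chain.from_iterable`, produces the flat shape.

```python
from itertools import chain

# Hypothetical model names, already sorted by f-stat (best first).
ranked_models = ["Model 2", "Model 1", "Model 3"]

# One pair per model: this is the list-of-lists shape from the first attempt.
nested = [[(m, "Coefs"), (m, "Prob")] for m in ranked_models]

# A second `for` clause walks the two statistics per model, so the result is flat.
flat = [(m, stat) for m in ranked_models for stat in ("Coefs", "Prob")]

# Alternative: flatten the nested list after the fact.
flat_via_chain = list(chain.from_iterable(nested))

print(flat)
```

Both `flat` and `flat_via_chain` contain the same six tuples in the same order, with each model's "Coefs" entry immediately followed by its "Prob" entry.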