I have a table of model results in a DataFrame. The index holds the independent variable names and the model statistics, and each model has two separate columns, one for coefficients and one for p-values. I would like to order the columns by model accuracy (the value in the "f-stat" row of each model's "Coefs" column) while keeping every coefficient column immediately followed by its p-value column.

For example:

        (Model 1, Coefs)  (Model 1, Prob)  ...  (Model 3, Coefs)  (Model 3, Prob)
Const           -0.00110         9.04e-01  ...          0.030600         0.000000
Var 1           0.124000         0.00e+00  ...          0.683400         0.021200
Var 2                NaN              NaN  ...          0.337700         0.065100
f-stat          27.76398         9.46e-10  ...          6.794688         0.000405

Supposing the ordering of model accuracy is Model 2, Model 1, Model 3, then I would like a way to generate a list that is [('Model 2','Coefs'),('Model 2','Prob'),('Model 1','Coefs'),('Model 1','Prob'),('Model 3','Coefs'),('Model 3','Prob')] so that I can sort the columns within the table.

I've tried using

[[(x[0],'Coefs'),(x[0],'Prob')] for x in dictOLS[Y][[x for x in dictOLS[Y].columns if x[1]=='Coefs']].loc['f-stat'].sort_values(ascending=False).index]

And

[*[(x[0],'Coefs'),(x[0],'Prob')] for x in dictOLS[Y][[x for x in dictOLS[Y].columns if x[1]=='Coefs']].loc['f-stat'].sort_values(ascending=False).index]

But the first one returns a list of lists, and the second raises an error.
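To isolate the problem from pandas, here is a minimal sketch of the two list shapes. The name `ordered_models` is hypothetical and stands in for the sorted f-stat index. At least in Python versions up to 3.12, a starred element such as `[*pairs for ...]` is rejected because iterable unpacking is not allowed as the element of a comprehension, so the flattening has to come from a second `for` clause or from a helper like `itertools.chain.from_iterable`.

```python
from itertools import chain

# Hypothetical model order, standing in for the sorted f-stat index.
ordered_models = ["Model 2", "Model 1", "Model 3"]

# One inner list per model: this is the list-of-lists shape.
nested = [[(m, "Coefs"), (m, "Prob")] for m in ordered_models]

# A second `for` clause in the same comprehension flattens the pairs.
flat = [pair for m in ordered_models for pair in [(m, "Coefs"), (m, "Prob")]]

# Alternative: flatten the nested list after building it.
flat_chain = list(chain.from_iterable(nested))
```

Both `flat` and `flat_chain` hold the six (model, statistic) tuples in a single list, which is the shape described above.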

It_is_Chris
    Try `[y for x in dictOLS[Y][[x for x in dictOLS[Y].columns if x[1]=='Coefs']].loc['f-stat'].sort_values(ascending=False).index for y in [(x[0],'Coefs'),(x[0],'Prob')]]` in order to flatten your list of lists. Also, you can refer to: https://stackoverflow.com/questions/952914/how-do-i-make-a-flat-list-out-of-a-list-of-lists – Swifty May 17 '23 at 19:28
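For reference, the flattening from the comment can be applied end to end roughly as follows. This is only a sketch under assumptions: `df` is a small hypothetical stand-in for `dictOLS[Y]` with (model, statistic) MultiIndex columns, and the Model 2 numbers are invented for illustration. It uses `DataFrame.xs` to pull out the "Coefs" columns instead of the inner list comprehension, which is a swapped-in but equivalent selection.

```python
import pandas as pd

# Hypothetical stand-in for dictOLS[Y]; the Model 2 values are invented.
columns = pd.MultiIndex.from_product(
    [["Model 1", "Model 2", "Model 3"], ["Coefs", "Prob"]]
)
df = pd.DataFrame(
    [
        [-0.0011, 9.04e-01, 0.0200, 1.00e-02, 0.0306, 0.0],
        [27.76398, 9.46e-10, 35.1, 1.00e-12, 6.794688, 0.000405],
    ],
    index=["Const", "f-stat"],
    columns=columns,
)

# f-stat of each model, read from the "Coefs" columns, largest first.
fstat = df.xs("Coefs", axis=1, level=1).loc["f-stat"]
ordered_models = fstat.sort_values(ascending=False).index

# Flat list of (model, statistic) keys, Coefs followed by Prob per model.
new_order = [(m, stat) for m in ordered_models for stat in ("Coefs", "Prob")]

# Reorder the columns to match the flat list.
df_sorted = df.reindex(columns=pd.MultiIndex.from_tuples(new_order))
```

With these made-up f-stat values the models come out as Model 2, Model 1, Model 3, and each "Coefs" column stays next to its "Prob" column.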

0 Answers