I have run the following Python code:

array = ['AEM000', 'AID017']
USA_DATA_1D = USA_DATA10.loc[USA_DATA10['JOBSPECIALTYCODE'].isin(array)]

I fit a regression model and extract the log-likelihood value for each item of this array in a for loop:

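import pandas as pd
import statsmodels.api as sm
from patsy import dmatrices

for item in array:
    # Keep only the rows for the current job specialty code
    USA_DATA_1D = USA_DATA10.loc[USA_DATA10['JOBSPECIALTYCODE'] == item]

    formula = "WEIGHTED_BASE_MEDIAN_FINAL_MEAN ~ YEAR"
    response, predictors = dmatrices(formula, USA_DATA_1D, return_type='dataframe')

    # Gaussian GLM of the response on YEAR
    mod1 = sm.GLM(response, predictors, family=sm.genmod.families.family.Gaussian()).fit()

    # One-row table holding the model label and its log-likelihood
    LLF_NG = {'model': ['Standard Gaussian'],
              'llf_value': mod1.llf
              }
    df_llf = pd.DataFrame(LLF_NG, columns=['model', 'llf_value'])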

Now I would like to rename the dataframe df_llf to df_llf_(name of the item), i.e. df_llf_AEM000 when the loop runs on the first item and df_llf_AID017 when it runs on the second one.

How can I do this?

ALollz
  • Don't. This is akin to creating a variable number of variables (https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables), which is both more complicated and messier. It is far easier to store the DataFrames in a dictionary whose keys are `'AEM000'` and `'AID017'` and whose values are the DataFrames. I.e. initialize `d = {}` outside the loop, and your last line would be `d[item] = pd.DataFrame(LLF_NG, columns=['model', 'llf_value'])` – ALollz Sep 01 '20 at 15:32

1 Answer

Assigning a data frame to a new name only creates a second reference to the same object. If you want the renamed data frame to be independent, so that later changes to it do not alter the original, use the copy method:

df_llf_AEM000 = df_llf.copy()

If you want to save iteratively several different versions of the original data frame, you can do something like this:

allDataframes = []
for i in range(10):
    df = df_original.copy()
    allDataframes.append(df)
print(allDataframes[0])
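
Applied to the loop in the question, the same idea can use the item code as the key of a dictionary, as suggested in the comment under the question. The following is only a minimal sketch under stated assumptions: the contents of `USA_DATA10` are made up for illustration, and `gaussian_llf` is a hypothetical stand-in for `mod1.llf` so that the example runs with pandas and NumPy alone. In the real loop you would store the value taken from the fitted model instead.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for USA_DATA10 (values are illustrative only).
USA_DATA10 = pd.DataFrame({
    "JOBSPECIALTYCODE": ["AEM000"] * 4 + ["AID017"] * 4,
    "YEAR": [2016, 2017, 2018, 2019] * 2,
    "WEIGHTED_BASE_MEDIAN_FINAL_MEAN": [50.0, 52.5, 53.0, 56.0,
                                        40.0, 41.0, 43.5, 44.0],
})


def gaussian_llf(df):
    """Hypothetical stand-in for mod1.llf: fit a straight line on YEAR by
    least squares and return the maximized Gaussian log-likelihood."""
    x = df["YEAR"].to_numpy(dtype=float)
    y = df["WEIGHTED_BASE_MEDIAN_FINAL_MEAN"].to_numpy(dtype=float)
    x = x - x.mean()  # centering does not change the fitted values
    slope, intercept = np.polyfit(x, y, 1)
    rss = np.sum((y - (slope * x + intercept)) ** 2)
    n = len(y)
    return -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)


array = ["AEM000", "AID017"]
df_llf = {}  # one DataFrame per item, keyed by the item code

for item in array:
    subset = USA_DATA10.loc[USA_DATA10["JOBSPECIALTYCODE"] == item]
    df_llf[item] = pd.DataFrame({
        "model": ["Standard Gaussian"],
        "llf_value": [gaussian_llf(subset)],
    })

print(df_llf["AEM000"])
```

`df_llf["AEM000"]` then plays the role of `df_llf_AEM000`, and new items in `array` get their own entry without any manual naming.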
Utpal Kumar
  • Indeed I can use this method, but the suffix AEM000 is still added manually. I would like to add it automatically, i.e. taken from the value of the loop variable. If that is too complicated, a simple number would do as well, such as df_llf_1, df_llf_2, etc., using the loop counter. – Jacques Troussart Sep 02 '20 at 07:05
  • @JacquesTroussart See the edit! You can save iteratively generated dfs in a list. – Utpal Kumar Sep 02 '20 at 07:14
  • Thank you very much. This fits what I want. – Jacques Troussart Sep 09 '20 at 05:55