
Hi, I have 4 pandas DataFrames: df1, df2, df3, df4. I would like to use a for loop to save each of them with to_pickle. This is what I tried:

out = 'mypath\\myfolder\\'

r = [ orders, adobe, mails , sells]
for i in r:
    i.to_pickle( out +   '\\i.pkl')

The code runs, but it does not save each DataFrame under its own name; instead it keeps overwriting the same file, i.pkl (I assume my code is wrong). It seems unable to name each file after its DataFrame: for example, inside the loop orders is saved as i.pkl, and the same happens for the other DataFrames. What I expect is 4 saved DataFrames named after the objects in r: orders.pkl, adobe.pkl, mails.pkl, sells.pkl.

How can I do this?

Parsifal

1 Answer


Python has no reliable way to turn a variable name into a string (and it is not something you would normally do), but there is a simple alternative: number the files.

import os

out = 'mypath\\myfolder\\'

df_list = [df1, df2, df3, df4]
for i, df in enumerate(df_list, 1):
    df.to_pickle(os.path.join(out, f'df{i}.pkl'))
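
One thing to watch when building the path with os.path.join: on Windows, a component that begins with a backslash is treated as rooted, so the folder given before it is dropped. The sketch below illustrates this with ntpath, the Windows implementation of os.path, which can be imported on any OS; the file name df1.pkl is only a placeholder.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

out = 'mypath\\myfolder\\'

# The file name starts with a backslash, so it counts as rooted
# and the folder in `out` is discarded.
rooted = ntpath.join(out, '\\df1.pkl')

# A plain file name is appended to `out` as intended.
relative = ntpath.join(out, 'df1.pkl')

print(repr(rooted))
print(repr(relative))
```

Here rooted comes out as \df1.pkl, while relative is mypath\myfolder\df1.pkl, so the file name passed to os.path.join should not start with a separator.
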

If you want to provide custom names for your files, here is my suggestion: use a dictionary.

df_map = {'orders': df1, 'adobe': df2, 'mails': df3, 'sells': df4}
for name, df in df_map.items():
    df.to_pickle(os.path.join(out, f'{name}.pkl'))
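
For anyone who wants to try the dictionary approach end to end, here is a minimal self-contained sketch. The tiny DataFrames, the save_frames helper, and the temporary output folder are all assumptions made for illustration; only the four names come from the question.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Placeholder frames standing in for the real orders/adobe/mails/sells data.
frames = {
    "orders": pd.DataFrame({"id": [1, 2]}),
    "adobe": pd.DataFrame({"id": [3]}),
    "mails": pd.DataFrame({"id": [4, 5, 6]}),
    "sells": pd.DataFrame({"id": [7]}),
}


def save_frames(frames, folder):
    """Pickle each DataFrame to <folder>/<name>.pkl and return the paths written."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for name, df in frames.items():
        path = folder / f"{name}.pkl"  # the dictionary key becomes the file name
        df.to_pickle(path)
        written.append(path)
    return written


out = tempfile.mkdtemp()  # temporary folder used only for this demo
paths = save_frames(frames, out)

# Read everything back, keyed by file name without the extension.
restored = {p.stem: pd.read_pickle(p) for p in paths}
print(sorted(restored))
```

Because the names live in the dictionary keys, adding a fifth DataFrame only requires one more entry; the loop itself does not change.
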
cs95
  • Thank you. What if my df list contained different names (e.g. df_list = [orders, adobe, mails, sells])? – Parsifal Dec 16 '19 at 09:23
  • @lucapellerossapelles There is no clean way to do it unless you use a dictionary (see edit). – cs95 Dec 16 '19 at 09:28
  • This is true, as DataFrames do not have a `name` attribute. You must handle the names yourself. A dictionary is the right choice here. See also this [question](https://stackoverflow.com/questions/31727333/get-the-name-of-a-pandas-dataframe) – AnsFourtyTwo Dec 16 '19 at 09:30