4

As title, I need to create multiple spreadsheets into a excel file with Pandas. While this thread and this one all provided solutions, I figured my situation is a bit different. Both of the cases use something similar to:

writer = pd.ExcelWriter('output.xlsx')
DF1.to_excel(writer,'Sheet1')
DF2.to_excel(writer,'Sheet2')
writer.save()

However, the problem is that I cannot afford to keep multiple dataframes in my memory at the same time since each of them are just too big. My data can be the complicated version of this: df = pd.DataFrame(dict(A=list('aabb'), B=range(4), C=range(6,10)))
Out: A B C 0 a 0 6 1 a 1 7 2 b 2 8 3 b 3 9

I intend to use the items ['a', 'b', 'c'] in grplist to perform some sort of calculation and eventually generate separate spreadsheets when data['A'] == a through c :

data = pd.read_csv(fileloc)
grplist = [['a','b','c'],['d','e','f']]
for groups, numbers in zip(grplist, range(1, 5)):
    for category in groups:
        clean = data[(data['A'] == category) & (data['B'] == numbers)]['C']
        # --------My calculation to generate a dataframe--------
        my_result_df = pd.DataFrame(my_result)
        writer = ExcelWriter('my_path_of_excel')
        my_resultdf.to_excel(writer, 'Group%s_%s' % (numbers, category[:4]))
        writer.save()
        gc.collect()

Sadly my code does not create multiple spreadsheets as groups, numbers are looped through. I can only get the last result in the single spreadsheet lying in my excel. What can I do?

This is my very first post here. I hope I'm following every rules so this thread can end well. If anything need to be modified or improved, please kindly let me know. Thanks for your help :)

Community
  • 1
  • 1
Chia-Yu Chien
  • 311
  • 3
  • 6

1 Answers1

1

consider the df

df = pd.DataFrame(dict(A=list('aabb'), B=range(4)))

loop through groups and print

for name, group in df.groupby('A'):
    print('{}\n\n{}\n\n'.format(name, group))

a

   A  B
0  a  0
1  a  1


b

   A  B
2  b  2
3  b  3

to_excel

writer = pd.ExcelWriter('output.xlsx')
for name, group in df.groupby('A'):
    group.to_excel(writer, name)
writer.save()
writer.close()
piRSquared
  • 285,575
  • 57
  • 475
  • 624
  • Thanks [tag:piRSquared] for your prompt response. However, this does not solve my problem. I've added a few more lines to describe my data and code. I guess the ultimate question I'm asking is: "How to create a multiple spreadsheets once a time" instead of grouping them together using `.groupby()`. If I am not understanding you correctly, please let me know. – Chia-Yu Chien Oct 27 '16 at 06:19