0

I am trying to write a function that accepts data and filename as arguments, with the former being the data that is to be saved under the name of the latter. I would like to keep it inside a function and not resort to list comprehension as I did to make OutWriter3 work via test3.

import numpy as np
import pandas as pd

data_a = np.random.randint(100, size=21)
data_b = np.random.randint(200, size=21)

def OutWriter1(data, filename):
    for d, f in zip(data, filename):
        out = pd.DataFrame(d)
        return out.to_excel(f, header=False, index=False)

def OutWriter2(data, filename):
    out = []
    for d, f in zip(data, filename):
        df = pd.DataFrame(d)
        out.append(df)
        out = pd.DataFrame(out)
        return out.to_excel(f, header=False, index=False)

def OutWriter3(data, filename):
    out = pd.DataFrame(data)
    return out.to_excel(filename, header=False, index=False)

test1 = OutWriter1([data_a, data_b], ['data_a_1.xlsx', 'data_b_1.xlsx'])
test2 = OutWriter2([data_a, data_b], ['data_a_2.xlsx', 'data_b_2.xlsx'])
test3 = [OutWriter3(i, j) for i, j in zip([data_a, data_b], ['data_a_3.xlsx', 'data_b_3.xlsx'])]

data_a_1.xlsx from OutWriter1 is correct but data_b_1.xlsx is nonexistent, data_a_2.xlsx is entirely wrong and data_b_2.xlsx is also nonexistent. However, data_a_3.xlsx and data_b_3.xlsx are correct.

Inspired by another question, I also tried to save data_a and data_b as sheets within a single Excel file without any luck (AttributeError: 'list' object has no attribute 'write').

def OutWriter4(data, filename):
    data = pd.DataFrame(data)
    with pd.ExcelWriter(filename) as writer:
        for n, df in enumerate(data):
            df.to_excel(writer, 'sheet%s' % n)
        writer.save()

test4 = OutWriter4([data_a, data_b], ['data_a_4.xlsx', 'data_b_3.xlsx'])
  • Is there an elegant way to create a function that creates Excel files provided the data and filename?
  • Is there also an elegant way to create a function that writes the different data to assigned sheets within a single Excel file?
naughty_waves
  • 265
  • 1
  • 13

1 Answers1

2

Solution

You could use the following code to either write multiple dataframes to a single excel file, or each of the dataframes to a single excel file.

target = 'single_file.xlsx'
targets = ['mult_1.xlsx', 'mult_2.xlsx', 'mult_3.xlsx']
# df1 = pd.DataFrame(data_a)
# df2 = pd.DataFrame(data_b)
# df3 = pd.DataFrame(data_c)
# dfs = [pd.DataFrame(x) for x in [data_a, data_b]] # >> in your case
dfs = [df1, df2, df3]

# to write to a single file
write_to_excel(targets = target, dfs = dfs, verbose=1)

# to write to multiple files
write_to_excel(targets = targets, dfs = dfs, verbose=1)

Custom Defined Function

def write_to_excel(targets, dfs: list, verbose=1):
    """Writes single of multiple dataframes to either a single or multiple excel files.

    targets: str or list of str --> path(s) excel files
    dfs: list of dataframes
    Example
    -------
    target = 'single_file.xlsx'
    targets = ['mult_1.xlsx', 'mult_2.xlsx', 'mult_3.xlsx']
    dfs = [df1, df2, df3]

    # to write to a single file
    write_to_excel(targets = target, dfs = dfs, verbose=1)

    # to write to a multiple files
    write_to_excel(targets = targets, dfs = dfs, verbose=1)
    """
    if not isinstance(targets, list):
        targets = [targets]
    writer_scheme = 'single' if len(target)==1 else 'multi'
    if verbose>=1:
        print(f'excel-writer-scheme: {writer_scheme}-file(s)')
    if writer_scheme == 'single':
        with pd.ExcelWriter(targets[0]) as writer:
            for i, df in enumerate(dfs):  
                df.to_excel(writer, sheet_name=f'Sheet_{i+1}')
    else:
        for df, target in zip(dfs, targets):
            df.to_excel(target, sheet_name='data')

References

  1. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html
CypherX
  • 7,019
  • 3
  • 25
  • 37
  • @naughty_waves Please consider **`accepting`** the answer, if it helped. – CypherX Apr 18 '20 at 19:15
  • Thank you, @CypherX. I have one question regarding the custom function. First, I tried to run it with my numpy arrays (`data_a` and `data_b`) but that did not work. Second, I decided to try to convert them into dataframes (`df_a=data_a` and `df_b=data_b`) by passing them as `test5=write_to_excel([data_a_5.xlsx, data_b_5.xlsx], [df_a, df_b])` but that did not work either. `AttributeError: 'list' object has no attribute 'write'` appeared both times. I am convinced the fault is at my side. Do you have any idea what I am doing wrong? – naughty_waves Apr 18 '20 at 21:12
  • Use this: `dfs = [pd.DataFrame(x) for x in [data_a, data_b]]` and then pass this `dfs` to the function. You need a list of dataframes. Initially, `data_a` and `data_b` are `numpy` arrays. Hence, you must convert them as dataframes as I showed here. – CypherX Apr 18 '20 at 21:15
  • I did a mistake in the previous comment. I meant that I converted them by writing `df_a=pd.DataFrame(data_a)` and `df_b=pd.DataFrame(data_b)`. It works now but I had to change `len(target)` to `len(targets)`. Thank you for helping me out, @CypherX! – naughty_waves Apr 18 '20 at 21:31