I have written a Python function that reads either a .csv or an .xls file and returns it as a pandas DataFrame. Based on the passed file_type, the function calls either pandas.read_csv() or pandas.read_excel(), with just one slight difference in the parameters. It works without an issue, but it is obviously repeated code that I would like to reduce. So how could I best:
- Have just one function call that is dynamically changed to the specific one defined by the file_type variable
- Dynamically change the parameters passed to that function based on the same variable?
Here is my current code. Thanks for your help.
    import pandas as pd
    from numpy import nan as np_nan  # assumed source of np_nan

    def file_to_df(file_name, fields=None, file_type=None, encoding=None):
        """Read stock level from csv or xlsx file. Filter SKU and Qty. Return dataframe."""
        # Note: `file_type == 'csv' or 'xls'` is always truthy, so test membership instead.
        if file_type in ('csv', 'xls'):
            if file_type == 'csv':
                data_frame = pd.read_csv(
                    file_name,
                    encoding=encoding,
                    converters={'Barcode': str, 'Qty': int},
                    usecols=fields
                )
            elif file_type == 'xls':
                data_frame = pd.read_excel(
                    file_name,
                    converters={'Barcode': str, 'Qty': int},
                    usecols=fields
                )
            # Remove empty rows
            data_frame.replace('', np_nan, inplace=True)
            data_frame.dropna(axis=0, how='any', subset=None, inplace=True)
            return data_frame
        else:
            print('no csv or xls filetype was handed to file_to_df')
For the parameters, I tried using two tuples that are passed into the function call.
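To make the goal concrete, here is a minimal sketch of the kind of structure I have in mind. It is not a tested solution for real files. It assumes a dict that maps file_type to a reader function plus the keyword arguments only that reader needs, unpacked with `**` rather than with tuples (tuples would only work for positional arguments). The `readers`, `shared_kwargs` and `extra_kwargs` names are made up for illustration, and raising ValueError instead of printing is my own assumption about how the error case could be handled.

```python
import io

import numpy as np
import pandas as pd


def file_to_df(file_name, fields=None, file_type=None, encoding=None):
    """Read a csv or xls file, drop incomplete rows, and return a DataFrame."""
    # Hypothetical lookup: file_type -> (reader, keyword arguments only it needs).
    readers = {
        "csv": (pd.read_csv, {"encoding": encoding}),
        "xls": (pd.read_excel, {}),
    }
    if file_type not in readers:
        # Assumption: fail loudly instead of printing and returning None.
        raise ValueError(f"unsupported file_type: {file_type!r}")
    reader, extra_kwargs = readers[file_type]

    # Keyword arguments that both readers share.
    shared_kwargs = {
        "converters": {"Barcode": str, "Qty": int},
        "usecols": fields,
    }

    # One call site: the function and its keyword arguments are both chosen above.
    data_frame = reader(file_name, **shared_kwargs, **extra_kwargs)

    # Remove empty rows
    data_frame = data_frame.replace("", np.nan)
    return data_frame.dropna(axis=0, how="any")


# Usage with an in-memory csv, so no file on disk is needed for a quick check.
sample = io.StringIO("Barcode,Qty\n0012,3\n0456,7\n")
print(file_to_df(sample, file_type="csv"))
```

With this layout, supporting another file type would only mean adding one entry to the dict, but I am not sure whether this is considered the idiomatic way to do it.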