Good day,
I am attempting to open multiple Excel (.xls) files and put them in one data frame. I am using glob.glob() to collect the files:
all_files = glob.glob(r'D:\Anaconda Hub\ARK analysis\Ark analysis\data\year2021\february\**.xls')
The output is a list like this:
['D:\\Anaconda Hub\\ARK analysis\\Ark analysis\\data\\year2021\\february\\ARK_Trade_02012021_0619PM_EST_601875e069e08.xls',
'D:\\Anaconda Hub\\ARK analysis\\Ark analysis\\data\\year2021\\february\\ARK_Trade_02022021_0645PM_EST_6019df308ae5e.xls',
'D:\\Anaconda Hub\\ARK analysis\\Ark analysis\\data\\year2021\\february\\ARK_Trade_02032021_0829PM_EST_601b2da2185c6.xls',
'D:\\Anaconda Hub\\ARK analysis\\Ark analysis\\data\\year2021\\february\\ARK_Trade_02042021_0637PM_EST_601c72b88257f.xls',
'D:\\Anaconda Hub\\ARK analysis\\Ark analysis\\data\\year2021\\february\\ARK_Trade_02052021_0646PM_EST_601dd4dc308c5.xls',
'D:\\Anaconda Hub\\ARK analysis\\Ark analysis\\data\\year2021\\february\\ARK_Trade_02082021_0629PM_EST_6021c739595b0.xls'..]
I am using the olefile package to read the files. Here is my code:
import os
import glob
import olefile as ol
import pandas as pd

# use olefile to iterate over the Excel files and make each one readable
with open(all_files,'r') as file:
    if file.endswith('.xls'):
        ole = ol.OleFileIO(file)
        if ole.exists('Workbook'):
            d = ole.openstream('Workbook')
            df = pd.read_excel(d, engine='xlrd', header=3, skiprows=3)
            print(df.head())
However, I get this error:
TypeError: expected str, bytes or os.PathLike object, not list
I do not understand why I get this error. As far as I can tell, I am iterating over the list, selecting each string, and passing it through the rest of the steps. Help would be appreciated to do this correctly and combine the Excel files into a single data frame. Thanks in advance.
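
For reference, a minimal sketch of the per-file loop described above. `open()` accepts one path at a time, so the list returned by `glob.glob()` has to be looped over explicitly, and each resulting frame collected before concatenating. The helper names `read_ark_xls` and `load_all` are hypothetical (they are not part of olefile or pandas), and the `header=3, skiprows=3` values are simply copied from the code above, so they may need adjusting for the actual files.

```python
import pandas as pd


def read_ark_xls(path):
    """Read one .xls file through olefile (hypothetical helper).

    The read_excel arguments are copied from the question's code.
    """
    import olefile as ol  # imported lazily so load_all can be tried without it

    ole = ol.OleFileIO(path)
    try:
        if not ole.exists('Workbook'):
            return None  # not a workbook stream; skip this file
        stream = ole.openstream('Workbook')
        return pd.read_excel(stream, engine='xlrd', header=3, skiprows=3)
    finally:
        ole.close()


def load_all(paths, reader=read_ark_xls):
    """Call `reader` on each path individually and stack the results."""
    frames = []
    for path in paths:  # one string at a time, never the whole list
        if not path.endswith('.xls'):
            continue
        df = reader(path)
        if df is not None:
            frames.append(df)
    # Return an empty frame rather than failing when nothing was read.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

With the list from above, this would be called as `df = load_all(all_files)`. The `reader` parameter is only there so the loop can be exercised with a stand-in function when no .xls files are at hand.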