I want to read an Excel file from a folder with read_excel and load it into a database, but the file is refreshed every week and its name changes (ReportWK01, ReportWK02, ...). The folder (named To_Load) contains only the one Excel file I need.

I tried specifying the path and then calling read_excel, but I don't know the correct syntax.

path = rb'\\csd-file\dd\bb\ss\uu\To_Load'
results = os.path.join(path, rb"*\*.xlsx")
df = pd.read_excel(results, engine='python')

It gives me this error:

ValueError: Must explicitly set engine if not passing in buffer or path for io.
  • Do you really need that engine parameter `engine='python'` ? – Florian H Jun 03 '19 at 07:55
  • Maybe this will help: https://stackoverflow.com/questions/20908018/import-multiple-excel-files-into-python-pandas-and-concatenate-them-into-one-dat. Also check the `glob` function. – anky Jun 03 '19 at 07:55

1 Answer

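# Try reading the file with the most recent modification timestamp.
import glob
import os

import pandas as pd

# Raw string, so backslashes in the UNC path are not treated as escapes
# (without the r prefix, '\u' is a syntax error in Python 3).
folder_path = r'\\csd-file\dd\bb\ss\uu\To_Load'

# glob.glob returns all paths matching the pattern (.xls, .xlsx, .xlsm).
excel_files = glob.glob(os.path.join(folder_path, '*.xls*'))

# Pair each file with its modification time.
mod_dates = [os.path.getmtime(f) for f in excel_files]
print(mod_dates)
# Sort by modification time (not by file name), newest first.
file_date = sorted(zip(excel_files, mod_dates), key=lambda pair: pair[1], reverse=True)
print("*" * 100)
print(file_date)
newest_file_path = file_date[0][0]
df = pd.read_excel(newest_file_path)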
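
The newest-file lookup can also be wrapped in a small helper. This is only a sketch: `newest_excel_file` is a hypothetical name, not part of pandas or the standard library. It assumes that a missing file should raise an error, and that Excel's temporary `~$` lock files (created while a workbook is open) should be skipped.

```python
from pathlib import Path


def newest_excel_file(folder):
    """Return the most recently modified Excel file in `folder` (hypothetical helper)."""
    # Match .xls, .xlsx and .xlsm; skip "~$" lock files left by open workbooks.
    candidates = [
        p for p in Path(folder).glob("*.xls*")
        if p.is_file() and not p.name.startswith("~$")
    ]
    if not candidates:
        raise FileNotFoundError(f"No Excel files found in {folder}")
    # max() with a key compares modification times rather than file names.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path can then be passed straight to pandas, for example `df = pd.read_excel(newest_excel_file(folder_path))`. Since the folder is expected to hold a single report, the modification-time comparison mainly acts as a safeguard in case last week's file has not been removed yet.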
vrana95