1

I have a directory where i have xls files which are being updated and new xls files are being added.

I have combined all of the data using-

import os
import pandas as pd
def read_sheets(filename):
    result = []
    sheets = pd.read_excel(filename, sheet_name=None)
    for name, sheet in sheets.items():
        sheet['Sheetname'] = name
        sheet['Row'] = sheet.index
        result.append(sheet)
    return pd.concat(result, ignore_index=True)
def read_files(filenames):
    result = []
    for filename in filenames:
        file = read_sheets(filename)
        file['Filename'] = filename
        result.append(file)
    return pd.concat(result, ignore_index=True)

files = ['1.xls', '2.xls','3.xls','4.xls','5.xls']
dfoo = read_files(files)

But I want to know if any changes are being made to these xls files, how can I automate appends to dfoo from these files and if new files are created lets say 6.xls or 7.xls later(which will have same column headers) how can that data also be appended to dfoo

  • You just want to automatically fill `files` and run the code against, correct? – Guimoute Mar 16 '22 at 17:24
  • essentially, yes. whenever there are updates to any of the xls in the directory folder or if new xls files are added –  Mar 16 '22 at 17:26
  • This should list the spreadsheets found in the desired folder. `files = [file for file in os.listdir(folder_path) if file.endswith(".xls")]` – Guimoute Mar 16 '22 at 17:31
  • im not trying to get the names of the new files. I need the whatever is in the file appended to dfoo –  Mar 16 '22 at 17:48
  • 1
    Just run your module periodically on all the files, each iteration should have all current information. – wwii Mar 16 '22 at 18:37
  • @wwii yes i plan to run on task scheduler. But I am wondering how i can account for files that havent been created yet. Lets say in the future there is a '9.xls'. How do I make the code pick that up. currently if i try to write that in my [files] list. it gives an error because the file doesnt yet exist in the directory. But in the future it will. So what should I change in that code so it could account for future files –  Mar 16 '22 at 19:13
  • 2
    What I wrote will grab the "future" files for you. – Guimoute Mar 16 '22 at 20:03
  • 1
    [Find all files in a directory with extension .txt in Python](https://stackoverflow.com/questions/3964681/find-all-files-in-a-directory-with-extension-txt-in-python) – wwii Mar 16 '22 at 20:47

0 Answers0