
I am currently using os.walk to navigate through all the subfolders and files in a massive network-drive directory. However, whenever my VPN disconnects, the for loop fails. When I re-run my code the next day, I would like to resume from the last file that was processed. What modifications should I make to my code below?

import os
import boto3

directory = '//DirectoryName/FolderName'

s3_client = boto3.client('s3')  # bucket and Target_File are defined elsewhere in my script

for root, dirs, files in os.walk(os.path.normpath(directory), topdown=False):
    for name in files:
        Source_File = os.path.join(root, name)
        # This uploads the file to the S3 bucket
        s3_client.upload_file(Source_File, bucket, Target_File)

The directory is really massive: it has hundreds of sub-folders and thousands of files in total.

  • Keep track of the files you already processed in a separate file – rdas Sep 16 '22 at 17:56
  • Are you sure that what you are doing is legal? – treuss Sep 16 '22 at 17:57
  • @treuss What do you mean? I am doing this work as part of my job. – Devansh Popat Sep 16 '22 at 18:40
  • @rdas That is a good point, but how do I resume from where I left off the previous day? – Devansh Popat Sep 16 '22 at 18:40
  • You read the file at the start of the script, loading all the file names into a set or something similar. Then, when walking the directory tree, you can skip any files which are already in the set (see the sketch after these comments). – rdas Sep 16 '22 at 18:44
  • [python - Continue from given folder when walking recursively through folder - Stack Overflow](https://stackoverflow.com/questions/73723605/continue-from-given-folder-when-walking-recursively-through-folder/73725393#73725393) – furas Sep 16 '22 at 23:07
  • Maybe you should use `rsync`; it checks whether you have a newer file and sends only the newer ones, so it will skip files which you already sent. It will also resend the last file if it was only partially sent. [backup - Rsync to AWS S3 bucket - Server Fault](https://serverfault.com/questions/754690/rsync-to-aws-s3-bucket) – furas Sep 16 '22 at 23:10
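
A minimal sketch of rdas's suggestion, assuming a local log file named processed_files.txt (a name chosen here purely for illustration), that s3_client is a boto3 client as in the question, and that the bucket name and key layout below are placeholders:

import os
import boto3

directory = '//DirectoryName/FolderName'
log_path = 'processed_files.txt'  # hypothetical log of files already uploaded
bucket = 'my-bucket'              # placeholder bucket name

s3_client = boto3.client('s3')

# Load the names of files uploaded on previous runs into a set.
processed = set()
if os.path.exists(log_path):
    with open(log_path) as f:
        processed = {line.rstrip('\n') for line in f}

with open(log_path, 'a') as log:
    for root, dirs, files in os.walk(os.path.normpath(directory), topdown=False):
        for name in files:
            Source_File = os.path.join(root, name)
            if Source_File in processed:
                continue  # already uploaded on an earlier run; skip it
            Target_File = os.path.relpath(Source_File, directory)  # hypothetical key layout
            s3_client.upload_file(Source_File, bucket, Target_File)
            # Record the file only after the upload succeeds, and flush so the
            # log survives an abrupt VPN disconnect.
            log.write(Source_File + '\n')
            log.flush()

Each run re-walks the whole tree but only re-uploads files that are not yet in the log; a file interrupted mid-upload is simply uploaded again on the next run.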

0 Answers