
I need to find a way to move only new files from a remote folder to my local machine. Using Python to walk the drive and get timestamps doesn't really work, since the drive is too big and there's considerable latency. I am working on Ubuntu and the remote Windows drive is mounted with Samba.

Any ideas greatly appreciated!

benr
  • It depends on a cloud folder provider, its abilities, sync protocol, etc. What cloud storage are you talking about? – Andrey Oct 13 '22 at 12:43
  • refer https://stackoverflow.com/a/39327156/7887883 – Pavan Kumar T S Oct 13 '22 at 12:44
  • Pavan, the solution proposed is not feasible due to the size of the folders I need to go through. Andrey, it's not really a cloud provider, it's just a remote drive accessible through VPN – benr Oct 13 '22 at 12:55

1 Answer

According to the os.scandir documentation:

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information
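The speedup comes from os.DirEntry objects caching metadata gathered during the directory scan, so calls like is_dir() usually avoid an extra system call per entry. A small self-contained illustration (the temporary directory and file names are made up for the demo):

```python
import os
import tempfile

# Build a tiny directory tree: one file and one subdirectory.
d = tempfile.mkdtemp()
open(os.path.join(d, 'a.txt'), 'w').close()
os.mkdir(os.path.join(d, 'sub'))

# DirEntry.is_dir() typically answers from data cached during the scan,
# rather than issuing a separate os.stat() for every entry.
entries = {e.name: e.is_dir() for e in os.scandir(d)}
print(entries)  # e.g. {'a.txt': False, 'sub': True}
```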

On my machine, the code below can scan ~100,000 files over a USB3 connection in about 0.4 seconds.

import os

def files_search(root, time):
    """Walk `root` iteratively and return paths of files modified after `time`."""
    files = []
    stack = [root]
    while stack:
        dirname = stack.pop()
        try:
            with os.scandir(dirname) as it:
                for e in it:
                    if e.is_dir(follow_symlinks=False):
                        # Always descend into subdirectories, whatever their mtime.
                        stack.append(e.path)
                    elif e.stat().st_mtime > time:
                        files.append(e.path)
        except PermissionError:
            print(f'WARNING: access denied: {dirname}')

    return files
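
Once the list of new files is built, the actual transfer to the local machine can be done with shutil.copy2, which preserves timestamps. A minimal self-contained sketch of the mtime-cutoff-plus-copy idea (the temp directories, file names, and 24-hour cutoff are hypothetical):

```python
import os
import shutil
import tempfile
import time

src = tempfile.mkdtemp()  # stands in for the mounted remote folder
dst = tempfile.mkdtemp()  # stands in for the local destination

# Create one "old" and one "new" file, backdating the old one by a week.
old = os.path.join(src, 'old.txt')
new = os.path.join(src, 'new.txt')
for p in (old, new):
    open(p, 'w').close()
past = time.time() - 7 * 24 * 3600
os.utime(old, (past, past))

# Copy only files modified within the last 24 hours.
cutoff = time.time() - 24 * 3600
for entry in os.scandir(src):
    if entry.is_file() and entry.stat().st_mtime > cutoff:
        shutil.copy2(entry.path, dst)  # copy2 preserves the mtime

print(sorted(os.listdir(dst)))  # ['new.txt']
```

shutil.move could replace copy2 if the files should disappear from the remote side, but moving across a Samba mount is a copy-then-delete anyway, so copying and deleting explicitly gives more control on failure.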
Luca