I'm trying to check whether some files on drive A are present or missing on another drive B, so I wrote this little script:
import os
import subprocess

# drive A
SOURCE_PATH = '/media/username/8e223d5b-2755-4e9f-a2f6-fac5e762e836/username'
# drive B
DESTINY_PATH = '/home/username/'
SUCCESS_CODE = 0  # diff exits with 0 when the two files are identical

if __name__ == '__main__':
    for source_actualdir, source_subdir, source_dirFiles in os.walk(SOURCE_PATH):
        for source_filename in source_dirFiles:
            source_file = os.path.join(source_actualdir, source_filename)
            found = False
            # Compare this source file against every file on drive B
            for destiny_actualdir, destiny_subdir, destiny_dirFiles in os.walk(DESTINY_PATH):
                for destiny_filename in destiny_dirFiles:
                    destiny_file = os.path.join(destiny_actualdir, destiny_filename)
                    response = subprocess.run(['diff', '-s', source_file, destiny_file], capture_output=True)
                    if response.returncode == SUCCESS_CODE:
                        print(f'Match {source_file} == {destiny_file}')
                        found = True
                        break
                if found:
                    break  # stop walking drive B once a match is found
            # Report the file as missing only if no match was found
            if not found:
                print(f'File {source_file} is missing in {DESTINY_PATH}')
However, it is far too slow when I run it: I have to check 242603 files (145 GB in total). I'd like to speed it up, but I don't know how.
What can I use for such a task?
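One direction that might help, as a minimal sketch rather than a definitive answer: instead of running `diff` on every pair of files, index drive B once and look each file from drive A up in that index. The sketch below groups the files on drive B by size and computes SHA-256 digests only for sizes that also occur on drive A, so most files are never read. The helper names `file_digest`, `iter_files` and `find_missing` are hypothetical, and the sketch assumes every file is readable and that two files with the same size and digest can be treated as identical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def iter_files(root):
    """Yield the full path of every regular file below root."""
    for current_dir, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(current_dir, filename)
            if os.path.isfile(path):  # skips broken symlinks
                yield path


def find_missing(source_root, destiny_root):
    """Return source files whose content appears nowhere under destiny_root."""
    # Pass 1: group destination files by size (no file contents are read).
    by_size = defaultdict(list)
    for path in iter_files(destiny_root):
        by_size[os.path.getsize(path)].append(path)

    digests_by_size = {}  # size -> set of digests, filled only when needed
    missing = []
    # Pass 2: hash a source file only if some destination file has its size.
    for path in iter_files(source_root):
        size = os.path.getsize(path)
        if size not in by_size:
            missing.append(path)
            continue
        if size not in digests_by_size:
            digests_by_size[size] = {file_digest(p) for p in by_size[size]}
        if file_digest(path) not in digests_by_size[size]:
            missing.append(path)
    return missing
```

Calling `find_missing(SOURCE_PATH, DESTINY_PATH)` with the constants from the script above would return the list of files to report as missing. Like the original script, this matches files by content and ignores their names and locations; each file is read at most once, so the running time should be dominated by disk throughput rather than by the number of file pairs.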