I have a source directory containing files in nested subdirectories. I also have a destination directory whose subdirectories are organized differently.

fileNames = <get all file names from source directory>
for fileName in fileNames {
    if <not found in destination directory> {
         print fileName
    }
}

How can I implement the pseudocode above?

EDIT:

Example file structure:
./sourcedir/file1.txt
./sourcedir/foldera/file2.txt
./sourcedir/foldera/missingfile.txt

./destdir/file2.txt
./destdir/folderb/file1.txt

So missingfile.txt should be printed, but not file1.txt or file2.txt, since they can be found somewhere under destdir.

EDIT2: I managed to write a Python implementation that does what I was aiming for. I had some trouble with the bash answers when I tried them. Can this be done more simply in bash?

import os
import fnmatch

sourceDir = "./sourcedir"
destinationDir = "./destdir"

def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename

print(sourceDir)
for sourcefilename in find_files(sourceDir, '*'):
    # Optionally restrict the check to certain extensions:
    #if not sourcefilename.lower().endswith(('.jpg', '.jpeg', '.gif', '.png','.txt','.mov','3gp','mp4','bmp')):
    #    continue
    sourceBaseName = os.path.basename(sourcefilename)
    shouldPrint = True
    # Rescan the destination tree for a file with the same base name
    for destfilename in find_files(destinationDir, '*'):
        destBaseName = os.path.basename(destfilename)
        if sourceBaseName == destBaseName:
            shouldPrint = False
            break
    if shouldPrint:
        print('Missing file:', sourcefilename)
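
The nested loop above walks the whole destination tree once for every source file. A minimal sketch of an alternative, assuming base names are compared case-sensitively as in the code above: collect the destination base names into a set once, then test each source file against it. The helper names base_names and missing_files are hypothetical, chosen only for this illustration.

```python
import os


def base_names(directory):
    """Return the set of base names of all files below `directory`."""
    names = set()
    for _root, _dirs, files in os.walk(directory):
        names.update(files)
    return names


def missing_files(source_dir, destination_dir):
    """Yield source paths whose base name occurs nowhere under destination_dir."""
    dest_names = base_names(destination_dir)  # scan the destination only once
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if name not in dest_names:
                yield os.path.join(root, name)


if __name__ == "__main__":
    # Directory names taken from the example structure in the question
    for path in missing_files("./sourcedir", "./destdir"):
        print("Missing file:", path)
```

With the example structure from the first edit, this should report only the path ending in missingfile.txt.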
user317706
  • You have tagged this as [tag:bash] and [tag:python]. Are you assigning tags randomly, or do you require the solution to be specifically in either of these languages (why?)? – tripleee Jul 22 '16 at 19:44
  • Good point. I immediately thought those two tags would make sense, so that I could understand the answer more easily. Maybe it is better to stick to one language to avoid mixing things up too much. – user317706 Jul 22 '16 at 19:52

2 Answers

In bash, this can be done by running diff -r source_dir target_dir | grep 'Only.*source_dir' | awk '{print $4}'.

  • diff -r source_dir target_dir recursively compares source_dir and target_dir, matching files by their relative path
  • grep 'Only.*source_dir' keeps only the lines for files that exist in the source directory but not in the target directory
  • awk '{print $4}' extracts the file name from each of those lines
Tom Gijselinck

A bit of a hack, but you could do something with find and diff, no Python needed:

diff -u <(cd sourcedir && find . -type f) <(cd destdir && find . -type f) |\
grep "^\-\./" | sed 's/^-//'

This compares the list of files in sourcedir with the ones in destdir and then prints out only the files that exist in sourcedir but not in destdir.
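
Note that this comparison works on relative paths, not base names, so a file that sits in a different subfolder in destdir still counts as missing. A minimal Python sketch of the same relative-path comparison, written in Python only to line up with the implementation in the question; the helper names relative_files and only_in_source are hypothetical:

```python
import os


def relative_files(directory):
    """Return the set of file paths below `directory`, relative to it."""
    found = set()
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full_path = os.path.join(root, name)
            found.add(os.path.relpath(full_path, directory))
    return found


def only_in_source(source_dir, destination_dir):
    """Return sorted relative paths present in source_dir but not in destination_dir."""
    return sorted(relative_files(source_dir) - relative_files(destination_dir))


if __name__ == "__main__":
    # Directory names taken from the example structure in the question
    for path in only_in_source("./sourcedir", "./destdir"):
        print(path)
```

For the example structure in the question, this lists all three source files, because none of them has the same relative path in destdir.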

shevron