The file structure looks like this:

/email1/spam

/email2/spam

/email3/spam ...

I want to copy all files under every 'spam' directory into a new directory called /email_data/spam.

I tried shutil.copytree, but it only copies the first directory (copytree requires that the destination does not already exist).

Then I tried distutils.dir_util.copy_tree. It works, but for some reason there are duplicated files after every copy (e.g. spam_email.txt and spam_email_1.txt). There should be 15045 files, but the code copies 16545, which is 1500 more...

J.Pei
  • Is there any specific reason why Python is needed? Other utilities like `rsync` might be a better fit. – metatoaster Oct 31 '18 at 02:24
  • Yes, I have to use python to do that. – J.Pei Oct 31 '18 at 02:25
  • Possible duplicate of [How do I copy an entire directory of files into an existing directory using Python?](https://stackoverflow.com/questions/1868714/how-do-i-copy-an-entire-directory-of-files-into-an-existing-directory-using-pyth) – metatoaster Oct 31 '18 at 02:33

3 Answers

You can use os.walk, check whether the current root is a 'spam' directory, and only then copy its files with shutil.
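A minimal sketch of that os.walk idea, assuming the goal is one flat destination directory. The function name `collect_spam` and its parameters are made up for illustration, and skipping files whose name already exists in the destination is my assumption, not something the question asks for:

```python
import os
import shutil


def collect_spam(src_root, dest_dir, target_name="spam"):
    """Copy files from every directory named `target_name` under `src_root`
    into the flat directory `dest_dir`.

    Returns (number of files copied, list of source paths skipped because
    a file with the same name was already in `dest_dir`).
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_abs = os.path.abspath(dest_dir)
    copied = 0
    skipped = []
    for root, dirs, files in os.walk(src_root):
        # Never treat the destination itself as a source directory.
        if os.path.abspath(root) == dest_abs:
            dirs[:] = []
            continue
        if os.path.basename(root) != target_name:
            continue
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(dest_dir, name)
            if os.path.exists(target):
                # Assumption: keep the first copy and report the clash.
                skipped.append(source)
                continue
            shutil.copy2(source, target)
            copied += 1
    return copied, skipped
```

For the layout in the question this would be called as `collect_spam("/", "/email_data/spam")`, although walking a narrower root is much faster if the email directories share a parent. The `skipped` list also shows whether the same file name occurs in more than one source directory.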

This may not be the most efficient way, but it is quite reasonable. Another way is to use os.system, for example with:

find . -type d -path '*/spam' | xargs -I {} cp -r {} ./spam

I haven't verified this.

Dadep
user2679290

In the end, I found that rsync is very easy to use, as metatoaster said: just call os.system(command).

distutils.dir_util.copy_tree actually works as well. There was no duplication error in the copy; the source directories themselves contain the duplicated files...
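Note that distutils was deprecated in Python 3.10 and removed in 3.12. On Python 3.8 or later, shutil.copytree accepts dirs_exist_ok=True, which allows a similar merge into an existing directory. A hedged sketch follows; the helper name `merge_dirs` is hypothetical and the final count is only there to compare against the expected total:

```python
import shutil
from pathlib import Path


def merge_dirs(sources, dest):
    """Merge every directory in `sources` into `dest` (Python 3.8+).

    Returns the number of files in `dest` afterwards.
    """
    for src in sources:
        # dirs_exist_ok=True lets copytree write into an existing directory;
        # a file with the same relative path is overwritten by the later copy.
        shutil.copytree(src, dest, dirs_exist_ok=True)
    return sum(1 for p in Path(dest).rglob("*") if p.is_file())
```

With the layout from the question, a call could look like `merge_dirs(sorted(Path("/").glob("email*/spam")), "/email_data/spam")`. If the returned count is lower than the sum of the per-directory counts, some file names occur in more than one source directory.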

J.Pei

I would suggest combining shutil.copytree() and shutil.copy() with rsync.

Here is an untested full example that you can copy and paste:

#!/usr/bin/env python3
import errno
import os
import shutil
import subprocess
from shutil import ignore_patterns

infoMsgDest = "Destination already exists, continuing using rsync ..."


def CopyFolder(src, dest):
    try:
        if not os.path.exists(dest):
            # copytree creates dest itself, so it must not exist yet
            shutil.copytree(src, dest, ignore=ignore_patterns('*.json', '*.css', '*.scss', '*.js', '*.jpg', '*.png', '*.xcf'))
        else:
            print(infoMsgDest)
            # Now choose your weapons for overwriting
            # maybe you want to change the working directory first, e.g. with os.chdir(dir)
            # -arv (archive, recursive and verbose)
            # make sure you got the trailing slashes correct here
            # check=True raises CalledProcessError if rsync exits with a non-zero status
            subprocess.run(["rsync", "-arv", src, dest], check=True)

    except OSError as e:
        # If the error was caused because the source wasn't a directory
        if e.errno == errno.ENOTDIR:
            shutil.copy(src, dest)
        else:
            print('Directory not copied. Error: %s' % e)


if __name__ == '__main__':
    # Python does not expand "~" on its own, so expand it explicitly
    CopyFolder("source/", os.path.expanduser("~/home/usr/Desktop/"))
JasParkety