2

I'd like to specify full paths to ignorable files and directories when calling shutil.copytree(). Something like

def my_ignore(dir, files):
    # return ["exclude.file"]  # working
    return ["/full_path_to/exclude.file"]  # not working

shutil.copytree(src, dest, ignore=my_ignore)

After this, the excluded file is still copied unless I return just the filename instead of the full path. The thing is, I really want to exclude one particular file, not every file with a matching name under different directories.

I referred to a number of questions here, such as: How to write a call back function for ignore in shutil.copytree

Filter directory when using shutil.copytree?

But none of the answers work. It looks like the ignore hook can only return glob-style names, and any constructed full path will not work.

Am I missing something?

kaya3
kakyo

2 Answers

1

ignore indeed must return just the names of the entries to skip, chosen from the names passed in as the second argument. However, the function is called once for each directory shutil.copytree() visits, so you get to ignore files per directory.

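If you have a full path to a file you need to ignore, then match against the first parameter passed to your ignore function; it is the path of the directory currently being copied, built from the src you passed to copytree(), so it is a full path whenever src is one: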

def my_ignore(dir, files):
    if dir == '/full_path_to':
        return {"exclude.file"}
    # Nothing to ignore elsewhere; copytree() needs a collection, not None.
    return set()

I return a set here; set membership testing is faster than with a list.

If you have a predefined set of paths to ignore, parse those out into a dictionary; keys are the directory paths, and values are sets of the filenames to ignore in each directory:

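import os
from collections import defaultdict

# ignored_paths: an iterable of full paths to skip, defined elsewhere.
to_ignore = defaultdict(set)
for path in ignored_paths:
    dirname, filename = os.path.split(path)
    to_ignore[dirname].add(filename)

def my_ignore(src, files):
    # .get() does not create a new entry for directories with nothing to ignore.
    return to_ignore.get(src, set())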
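For anyone who wants to try the dictionary approach end to end, here is a minimal, self-contained sketch. The make_ignore helper, the temporary directory layout, and the os.path.abspath normalization are my own assumptions for illustration, not part of shutil; the normalization is there so the lookup still matches if src is given as a relative path or with a trailing separator.

```python
import os
import shutil
import tempfile
from collections import defaultdict

def make_ignore(ignored_paths):
    """Build an ignore callback from full paths (hypothetical helper)."""
    to_ignore = defaultdict(set)
    for path in ignored_paths:
        dirname, filename = os.path.split(os.path.abspath(path))
        to_ignore[dirname].add(filename)

    def ignore(directory, names):
        # Normalize the directory the same way as the keys before the lookup.
        return to_ignore.get(os.path.abspath(directory), set())

    return ignore

# Throwaway tree with the same filename in two directories.
root = tempfile.mkdtemp()
src = os.path.join(root, "src")
dest = os.path.join(root, "dest")
os.makedirs(os.path.join(src, "sub"))
for rel in ("exclude.file", os.path.join("sub", "exclude.file")):
    with open(os.path.join(src, rel), "w") as fh:
        fh.write("data")

# Skip only the top-level copy of exclude.file.
shutil.copytree(src, dest, ignore=make_ignore([os.path.join(src, "exclude.file")]))

print(os.path.exists(os.path.join(dest, "exclude.file")))         # False
print(os.path.exists(os.path.join(dest, "sub", "exclude.file")))  # True
```

Only the file at the listed full path is skipped; the file with the same name under sub is still copied.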
Martijn Pieters
  • Membership testing in a set isn't faster than with a list when there's only one item. :-) – kindall Jul 09 '13 at 17:50
  • @kindall: I doubt this is limited to just one filename at a time though. – Martijn Pieters Jul 09 '13 at 17:53
  • @MartijnPieters Is ignored_paths a list of paths? I tried your code and got an error: File "/HelloCopytree.py", line 28, in my_ignore2 return to_ignore.get(folder, set()) TypeError: get() takes no keyword arguments – kakyo Jul 09 '13 at 19:42
  • My bad: I put default=set() as a keyword arg there. Your code works fine. Thanks! – kakyo Jul 09 '13 at 19:49
1

It's not magic. copytree() copies the contents of one directory at a time and it specifically looks for filenames in the ignore list you return. A full path is never the name of a file, so it is never matched.
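To see this for yourself, a small sketch like the following records what copytree() hands to the callback. The temporary layout and the logging_ignore name are made up for the demonstration; the callback ignores nothing and only logs its arguments.

```python
import os
import shutil
import tempfile

# Hypothetical layout used only for this demonstration.
root = tempfile.mkdtemp()
src = os.path.join(root, "src")
os.makedirs(os.path.join(src, "sub"))
for rel in ("exclude.file", os.path.join("sub", "exclude.file")):
    with open(os.path.join(src, rel), "w") as fh:
        fh.write("data")

calls = []

def logging_ignore(directory, names):
    # Record what copytree() passes in; ignore nothing.
    calls.append((directory, sorted(names)))
    return set()

shutil.copytree(src, os.path.join(root, "dest"), ignore=logging_ignore)

for directory, names in calls:
    print(directory, names)
```

Each call receives one directory path plus the bare names inside it, which is why a returned full path can never equal any of the names being checked.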

However, the dir parameter will help you do what you want:

def my_ignore(dir, files):
    if dir == "/full/path/to":
        return ["exclude.file"]
    else:
        return []
kindall