
I am writing a backup script that uses the tarfile module. I am a beginner in Python. I have a list of paths that need to be archived into a tar.gz file. After seeing this post, I came up with the following (this is part of my script). The archive gets created, but files with the .tmp and .data extensions aren't being omitted. I am using Python 3.5.

import tarfile

L = [path1, path2, path3, path4, path5]
exclude_files = [".tmp", ".data"]
# print(L)

def filter_function(tarinfo):
    if tarinfo.name in exclude_files:
        return None
    else:
        return tarinfo

with tarfile.open("backup.tar.gz", "w:gz") as tar:
    for name in L:
        tar.add(name, filter=filter_function)
akya

1 Answer


You're comparing the full member name against a list of extensions, so the test never matches.

Just use os.path.splitext and compare the extension:

 if os.path.splitext(tarinfo.name)[1] in exclude_files:

Shorter: rewrite your add line with a conditional expression and a lambda to avoid the helper function:

tar.add(name, filter=lambda tarinfo: None if os.path.splitext(tarinfo.name)[1] in exclude_files else tarinfo)
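
To see the extension check working end to end, here is a minimal sketch. It assumes a throwaway temporary directory and made-up file names (`keep.txt`, `scratch.tmp`, `cache.data`); the function names `skip_excluded_extensions` and `archive_demo` are only illustrative and are not part of the original script.

```python
import os
import tarfile
import tempfile

# Extensions to leave out of the archive (same values as in the question).
EXCLUDE_EXTENSIONS = {".tmp", ".data"}


def skip_excluded_extensions(tarinfo):
    """Return None to drop a member, or the TarInfo object to keep it."""
    extension = os.path.splitext(tarinfo.name)[1]
    if extension in EXCLUDE_EXTENSIONS:
        return None
    return tarinfo


def archive_demo():
    """Build a small archive from hypothetical files and list what was stored."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source")
        os.mkdir(source)
        for filename in ("keep.txt", "scratch.tmp", "cache.data"):
            with open(os.path.join(source, filename), "w") as handle:
                handle.write("example\n")

        archive_path = os.path.join(workdir, "backup.tar.gz")
        with tarfile.open(archive_path, "w:gz") as tar:
            # arcname keeps the stored names short and predictable.
            tar.add(source, arcname="source", filter=skip_excluded_extensions)

        with tarfile.open(archive_path, "r:gz") as tar:
            return sorted(tar.getnames())


if __name__ == "__main__":
    print(archive_demo())
```

The directory entry itself has no extension, so it is kept; only the two files with excluded extensions are dropped.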
Jean-François Fabre
  • Thanks, that worked. I have a question: if I have to add a path to the exclusion list, how would that work? – akya Apr 14 '17 at 09:27
  • That would be a relative path then. You'd call `os.path.dirname(tarinfo.name)` and compare the result with the directory you want to exclude. I suggest printing the two values you're comparing in your function so you can see whether they have any chance of matching. Had you done that in the first place, you'd have seen that you were comparing extensions with the full name. – Jean-François Fabre Apr 14 '17 at 09:28
  • I wrote this instead:

        L = [path1, path2, path3, path4, path5]
        exclude_files = [".tmp", ".data", "/media/Data/Textfiles/Linux", "/media/Data/Textfiles/Old/Pushbullet"]

        def exclude_function(filename):
            if filename in exclude_files or os.path.splitext(filename)[1] in exclude_files:
                return True
            else:
                return False

        with tarfile.open("backup.tar.gz", "w:gz") as tar:
            for name in L:
                tar.add(name, exclude=exclude_function)

    – akya Apr 14 '17 at 11:53
  • Somehow I am not getting the filter part. The above code works for every extension or path mentioned in the exclusion list. As for the shorter code, I only started learning Python a few days ago, so I am writing code as expanded as possible. Once I am more familiar with the language, I will try to be more concise. – akya Apr 14 '17 at 11:54
  • Make two lists: one for path exclusions and one for extension exclusions. Also, I doubt that the tarfile contains absolute paths. Print the names in your filter function to see what the filter receives as input. – Jean-François Fabre Apr 14 '17 at 12:03
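
The two-list suggestion from the comments could look roughly like the sketch below. It relies on the fact that `tarfile` stores member names without a leading `/` (or drive letter), so excluded directories are normalised the same way before comparing. The helper names `normalise`, `make_filter` and `archive_demo`, and the example directory layout, are assumptions for illustration only. Note that the `exclude` argument of `tar.add` was deprecated in Python 3.2 and removed in 3.7, which is why this sketch uses `filter`.

```python
import os
import tarfile
import tempfile

# Extension exclusions, kept separate from directory exclusions.
EXCLUDE_EXTENSIONS = {".tmp", ".data"}


def normalise(path):
    """Convert a filesystem path to the form tarfile uses for member names."""
    without_drive = os.path.splitdrive(path)[1]
    return without_drive.replace(os.sep, "/").strip("/")


def make_filter(excluded_dirs):
    """Build a filter that drops excluded directories and extensions."""
    normalised_dirs = [normalise(directory) for directory in excluded_dirs]

    def backup_filter(tarinfo):
        name = tarinfo.name
        for directory in normalised_dirs:
            # Drop the directory itself and everything stored below it.
            if name == directory or name.startswith(directory + "/"):
                return None
        if os.path.splitext(name)[1] in EXCLUDE_EXTENSIONS:
            return None
        return tarinfo

    return backup_filter


def archive_demo():
    """Archive a hypothetical tree by absolute path and list what was kept."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source")
        os.makedirs(os.path.join(source, "Old"))
        for relative in ("keep.txt", "scratch.tmp", os.path.join("Old", "note.txt")):
            with open(os.path.join(source, relative), "w") as handle:
                handle.write("example\n")

        archive_path = os.path.join(workdir, "backup.tar.gz")
        backup_filter = make_filter([os.path.join(source, "Old")])
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(source, filter=backup_filter)

        # Report names relative to the source directory for readability.
        prefix = normalise(source) + "/"
        with tarfile.open(archive_path, "r:gz") as tar:
            return sorted(
                name[len(prefix):]
                for name in tar.getnames()
                if name.startswith(prefix)
            )


if __name__ == "__main__":
    print(archive_demo())
```

Returning `None` for a directory member also stops `tar.add` from descending into it, so the files underneath never reach the filter.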