59

My source tree contains several directories which are using Git source control, and I need to tarball the whole tree excluding any references to the Git metadata or custom log files.

I thought I'd have a go using a combination of find/egrep/xargs/tar, but somehow the tar file contains the .git directories and the *.log files.

This is what I have:

find -type f . | egrep -v '\.git|\.log' | xargs tar rvf ~/app.tar

Can someone explain my misunderstanding here? Why is tar processing the files that find and egrep are filtering?

I'm open to other techniques as well.

Peter Mortensen
zaf

10 Answers

107

You will get a nasty surprise when the file list grows too long for a single command line: xargs then runs tar more than once, and with the c (create) option each run overwrites the archive written by the previous one, so only the last batch of files ends up in the tar file.

GNU tar has the --exclude option which will solve this issue:

tar cvf ~/app.tar --exclude .git --exclude "*.log" .
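To see what the two --exclude patterns do, here is a small sketch that builds a throwaway tree and archives it. The directory and file names are made up for the demo and are not from the question.

```shell
# Build a throwaway tree (hypothetical names) to try the exclude patterns on.
demo=$(mktemp -d)
mkdir -p "$demo/src/app/.git" "$demo/src/app/lib"
echo "code" > "$demo/src/app/lib/main.c"
echo "ref"  > "$demo/src/app/.git/HEAD"
echo "log"  > "$demo/src/app/debug.log"

# Archive the tree, skipping .git directories and *.log files.
# -C changes into the source directory so member names start with "./".
tar cf "$demo/app.tar" --exclude .git --exclude "*.log" -C "$demo/src" .

# List what actually went into the archive.
listing=$(tar tf "$demo/app.tar")
echo "$listing"
```

The listing should contain lib/main.c and its parent directories, but nothing under .git and no .log files.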
Yuri
Ole Tange
58

You can try directly with the tar option --exclude-vcs:

--exclude-vcs:
          Exclude version control system directories

For example:

tar cvfj nameoffile.tar.bz2 --exclude-vcs directory/

It works with Git.
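Since --exclude-vcs only covers version control metadata, the *.log files from the question still need their own --exclude. A rough sketch, assuming GNU tar and using made-up directory names:

```shell
# Throwaway tree with Git and Subversion metadata plus a log file (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/tree/one/.git" "$demo/tree/two/.svn"
echo x > "$demo/tree/one/.git/HEAD"
echo y > "$demo/tree/two/.svn/entries"
echo z > "$demo/tree/one/app.c"
echo w > "$demo/tree/two/trace.log"

# --exclude-vcs drops the VCS directories; --exclude "*.log" drops the logs.
# Both options are placed before the directory being archived.
tar cf "$demo/tree.tar" --exclude-vcs --exclude "*.log" -C "$demo" tree

listing=$(tar tf "$demo/tree.tar")
echo "$listing"
```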

Peter Mortensen
Ivan
16

Try something like this:

git archive --format=tar -o ~/tarball.tar -v HEAD

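git archive only packs files that are tracked in the given commit, so the .git directory is never included. Add your .log files and everything else you don't want packed to your .gitignore file so they never get committed in the first place.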
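A quick way to convince yourself is to try it in a scratch repository. This is only a sketch: the repository, file names and commit identity are invented for the demo.

```shell
# Scratch repository with one tracked source file and one ignored log file.
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo" || exit 1
echo 'int main(void) { return 0; }' > main.c
echo 'noise' > debug.log
echo '*.log' > .gitignore

# Only main.c and .gitignore are committed; debug.log stays untracked.
git add main.c .gitignore
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q -m 'initial'

# Archive the committed tree and list the result.
git archive --format=tar -o "$demo/tarball.tar" HEAD
listing=$(tar tf "$demo/tarball.tar")
echo "$listing"
```

Note that the archive reflects HEAD, so uncommitted changes in the working tree are not part of it.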

jkramer
  • nice, is there a way to get a snapshot of code without making a commit tho? I want to archive the current directory even if files are not staged or committed? – Alexander Mills May 03 '19 at 22:32
13

To exclude version control system directories:

tar --exclude-vcs

This will exclude svn, git metafiles etc.

PeloNZ
5

Newer versions of GNU tar can exclude version control directories automatically with the --exclude-vcs flag. This takes care of .git as well.

Peter Mortensen
KeshV
4

[slap] Bah! The parameters to find were in the wrong order! I didn't see the warnings because they whizzed off the screen. This allowed '.' to pass through egrep which caused tar to slurp up everything.

That will teach me for drowning important messages in verbose debug.

This works:

find . -type f | egrep -v '\.git|\.log' | xargs tar cvf ~/app.tar
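The pipeline can still trip over file names containing spaces, and over xargs splitting a long list into several tar runs. One possible way around both, assuming a tar that supports --null and -T (GNU tar does), is to let tar read the NUL-separated list itself. The paths below are invented for the demo.

```shell
# Throwaway tree including a directory name with a space (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/src/.git" "$demo/src/my docs"
echo a > "$demo/src/my docs/notes.txt"
echo b > "$demo/src/.git/config"
echo c > "$demo/src/run.log"

# find emits NUL-terminated names; tar reads them from stdin in a single run,
# so there is no xargs splitting and no word-splitting on spaces.
(
  cd "$demo/src" || exit 1
  find . -type f ! -path '*/.git/*' ! -name '*.log' -print0 |
    tar --null -T - -cf "$demo/app.tar"
)

listing=$(tar tf "$demo/app.tar")
echo "$listing"
```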
Peter Mortensen
zaf
3

git-archive may be what you're looking for.

Peter Mortensen
Skilldrick
3

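You could do that without grep; find is powerful enough on its own. Note that -iname needs a wildcard to match *.log files, and -path is what filters out the files inside .git directories:

 find . -type f -not \( -path "*/.git/*" -or -iname "*.log" \) | xargs ...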
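Another option is -prune, which stops find from descending into .git directories at all instead of filtering their contents afterwards. A small sketch with made-up file names:

```shell
# Throwaway tree (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/a/.git"
echo h > "$demo/a/.git/HEAD"
echo c > "$demo/a/x.c"
echo l > "$demo/a/y.log"

# Prune any directory entry named .git; otherwise print regular files
# whose names do not end in .log.
files=$(cd "$demo" && find . -name .git -prune -o -type f ! -name '*.log' -print)
echo "$files"   # prints ./a/x.c
```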
Jürgen Steinblock
2

For doing it from outside the app directory:

tar cvfz app.tar.gz --exclude ".git/*" --exclude ".git" app/
Peter Mortensen
1

For me, the .gitignore content is what I needed:

tar cvfz $PROJECT.tar.gz --exclude-from=$PROJECT/.gitignore $PROJECT

--exclude-from reads the given file and excludes the patterns listed in it. This works for simple glob patterns; tar does not understand gitignore-specific syntax such as ! negation.
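Here is a rough sketch of that approach on a scratch project. The project layout and the .gitignore contents are invented for the demo, and an extra --exclude .git is added because a .gitignore normally does not list the .git directory itself.

```shell
# Scratch project with a simple .gitignore (hypothetical names and patterns).
demo=$(mktemp -d)
mkdir -p "$demo/proj/build" "$demo/proj/.git"
echo 'int main(void) { return 0; }' > "$demo/proj/main.c"
echo 'obj'  > "$demo/proj/build/main.o"
echo 'log'  > "$demo/proj/out.log"
echo 'ref'  > "$demo/proj/.git/HEAD"
printf '%s\n' '*.log' 'build' > "$demo/proj/.gitignore"

# Reuse the .gitignore patterns as tar exclude patterns, and skip .git too.
tar cf "$demo/proj.tar" \
    --exclude-from="$demo/proj/.gitignore" --exclude .git \
    -C "$demo" proj

listing=$(tar tf "$demo/proj.tar")
echo "$listing"
```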

Ricky Levi