24

What is the easiest way to do incremental backups of a git repository with git bundle?

If I just wanted to back up a single branch, I could do something along these lines:

git bundle create foo last-backup..master
git tag -f last-backup master

But what if I want to back up everything (including all branches)?


To answer the questions in the comments:

Strictly speaking, I do not need to use the usual Git bundles, as long as the solution satisfies the following properties:

  • Each incremental backup is a single file. I can store it somewhere and subsequent incremental backups do not need to modify this file.

  • The size of the file is approximately equal to the total size of Git commits since the previous backup. Changes in binary files are also efficiently stored.

  • A full backup + all incremental backups since then contain everything that I need to automatically reconstruct the repository, including all branches.

(As a naive example, simply constructing a tar archive with recently-changed files in the git repository fails to satisfy the second requirement if, for example, automatic garbage collection has occurred.)

And ideally I would also like to have a system that is idiot-proof:

  • I can take virtually any full backup of my Git repository, plus all recent incremental backups, and I can simply "pull" everything from the backups and the repository will be up-to-date. In particular, it does not matter if there is a partial overlap between the full backup and incremental backups.

Git bundles satisfy all this very nicely if I only need to handle one branch.

Jukka Suomela
  • Why do you need to use git bundles for backups? Is that a PHB requirement? – user1338062 Aug 29 '12 at 10:29
  • @user1338062: Bundles would be ideal for my workflow. If you have an alternative solution that achieves all the nice properties of Git bundles, I would be happy to hear. – Jukka Suomela Aug 29 '12 at 14:10
  • You should probably expand the question a bit with your requirements for Git repository backup. – user1338062 Aug 29 '12 at 14:43
  • This question (and the answers there) seem to be somewhat relevant: http://stackoverflow.com/questions/3635952/how-to-use-git-bundle-for-keeping-development-in-sync – Jukka Suomela Aug 31 '12 at 12:48

4 Answers

15

(We discussed this problem with Jukka; this is the outcome.)

Preliminaries:

  1. Have the last backup available as backup.bundle
  2. Have a remote named backup that points at backup.bundle

Making a backup:

  1. git fetch backup – just to make sure we're up to date
  2. git bundle create newbackup.bundle ^backup/A ^backup/B A B C
    • This creates a bundle that excludes everything that has already been backed up
    • It's easy to generate the needed ^backup/A-style arguments from refs/remotes/backup/
    • Likewise the A-style arguments are from refs/heads
  3. copy newbackup.bundle to wherever you keep your backups
  4. replace backup.bundle with newbackup.bundle so you know where to start the next incremental backup
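
The ^backup/A-style and A-style arguments in step 2 can be produced with git for-each-ref instead of listing the .git/refs directories, which miss any refs that Git has packed into .git/packed-refs. A minimal sketch, assuming the remote is named backup as above; the function name incremental_bundle and its output-path parameter are made up for illustration:

```shell
#!/bin/sh
# Sketch: bundle all local branches, excluding everything already
# reachable from the remote-tracking refs of the remote "backup".
# incremental_bundle and its parameter are hypothetical names.

incremental_bundle() {
    out=$1    # path of the new bundle file

    # Local branches to include: refs/heads/A refs/heads/B ...
    includes=$(git for-each-ref --format='%(refname)' refs/heads)

    # Tips that were already backed up: ^refs/remotes/backup/A ...
    excludes=$(git for-each-ref --format='^%(refname)' refs/remotes/backup)

    # Unquoted on purpose: ref names cannot contain whitespace or
    # glob characters, so word splitting yields one argument per ref.
    git bundle create "$out" $excludes $includes
}
```

Calling incremental_bundle newbackup.bundle after git fetch backup corresponds to step 2. If nothing has changed since the last backup, git bundle refuses to create an empty bundle and the function returns a non-zero status.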

Recovering:

  1. Have a repository that is either empty or represents an old version of your repository
  2. For every backup file, in sequence:
    1. git remote rm recovery
    2. git remote add recovery <name-of-bundle>
    3. git fetch recovery – you need to name the remote for this to work
  3. Now you should have every branch available in refs/remotes/recovery
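
The recovery loop can be sketched as a small script. This is a variation on the steps above, not a literal transcription: instead of removing and re-adding the remote, it fetches straight from each bundle file with an explicit refspec, so refs fetched from earlier bundles stay in place (as far as I understand, Git only accepts an incremental bundle if its prerequisite commits are reachable from existing refs). The function name restore_from_bundles is hypothetical; the branches end up under refs/remotes/recovery.

```shell
#!/bin/sh
# Sketch: replay a full bundle followed by its incremental bundles.
# restore_from_bundles is a hypothetical name; pass the bundle files
# oldest first and run it inside the repository being restored.

restore_from_bundles() {
    for bundle in "$@"; do
        # Store every branch from the bundle as a remote-tracking ref.
        # The leading "+" allows non-fast-forward updates.
        git fetch -q "$bundle" '+refs/heads/*:refs/remotes/recovery/*' \
            || return 1
    done
}
```

For example, in a freshly initialised repository: restore_from_bundles full.bundle inc1.bundle inc2.bundle.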
opqdonut
  • Reading the `.git/refs/heads` and `.git/refs/remotes/backup` directories does not seem to work as intended, so I ended up invoking `git branch` and `git branch -r` and parsing the output. Other than that, this basic approach seems to work perfectly. – Jukka Suomela Sep 12 '12 at 11:38
  • I think I understand what this approach is doing, I just don't understand the "It's easy to generate..." statements. How would I generate the excludes from refs/remotes/backup (and presumably the includes from refs/heads)? Is it just a matter of doing an ls() on the directory and constructing the arguments from that? – Mark E. Hamilton Mar 02 '16 at 01:18
6

Try using --since with --all.

Create the first backup:

git bundle create mybundle-all --all

Do an incremental backup:

git bundle create mybundle-inc --since=10.days --all

The incremental should contain all commits on all branches that have happened in the past 10 days. Make sure the --since parameter goes back far enough or you might miss a commit. Git will also refuse to create the bundle if no commits have happened in that time-frame, so plan for that.
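
One way to plan for the empty-bundle case is to wrap the command and check its exit status. A minimal sketch; the function name bundle_since and its two parameters are made up for illustration:

```shell
#!/bin/sh
# Sketch: incremental bundle of all refs since a given date.
# bundle_since and its parameters are hypothetical names.

bundle_since() {
    out=$1      # bundle file to write
    since=$2    # anything --since accepts, e.g. 10.days or 2012-08-30

    if git bundle create "$out" --since="$since" --all; then
        echo "wrote $out"
    else
        # Most likely no commit falls inside the time window, in which
        # case Git refuses to create an empty bundle. Other errors end
        # up here too, so Git's own message is left visible on stderr.
        echo "no bundle written for --since=$since" >&2
        return 1
    fi
}
```

For example, bundle_since mybundle-inc 10.days. A backup script can treat a non-zero status as "nothing to store" rather than as a hard failure.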

onionjake
  • 1
    This is a good idea, but I wonder if I could easily use, e.g., tags instead of trying to calculate the right number of days for the `--since` flag? – Jukka Suomela Aug 31 '12 at 11:24
  • 1
    You could specify a date e.g. `--since {2012-08-30}` and encode the date into last increment bundle file name. – kan Aug 31 '12 at 15:26
1

You could do

git clone --mirror <your_repo> my-backup.git

It will create a bare repo with all refs.

Then you could periodically do git push --mirror <my-backup>.
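
The two commands can be combined into one helper that clones on the first run and pushes afterwards. A minimal sketch; mirror_backup and its parameters are hypothetical names, and both arguments are assumed to be local paths:

```shell
#!/bin/sh
# Sketch: keep a bare mirror of a repository up to date.
# mirror_backup and its parameters are hypothetical names.

mirror_backup() {
    src=$1    # path of the working repository
    dst=$2    # path of the bare mirror, e.g. my-backup.git

    if [ -d "$dst" ]; then
        # Mirror already exists: push every ref, deleting stale ones.
        git -C "$src" push -q --mirror "$dst"
    else
        # First run: create the bare mirror with all refs.
        git clone -q --mirror "$src" "$dst"
    fi
}
```

Note that this keeps a second repository in sync rather than producing one file per incremental backup.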

kan
-1

It seems opqdonut's solution will not work, because ^backup/A ^backup/B only point to the last incremental backup, whereas the refs from all previous incremental backups need to be excluded.

You would need to create a remote for each of the previous bundles.

UPD: No, it should work; see Jukka's comment below.

vsespb
  • Example: two branches, A and B (for simplicity, with no common commits). First 10 backups: only A is modified. 11th backup: B is modified. 12th backup: the last bundle contains only "B" refs, so the new bundle will contain all "A" refs (including those that were in the first 10 backups). – vsespb Mar 01 '13 at 23:51
  • UPD: here is an example implementation: https://github.com/vsespb/git-incremental-backup – vsespb Mar 01 '13 at 23:52
  • In your scenario, after the 11th backup, both backup/A and backup/B will exist, and both of them will be up-to-date. Fetching from the 11th bundle does not delete backup/A. Note that you will never *delete* the remote called "backup", you just fetch new things from the new bundles. – Jukka Suomela Mar 02 '13 at 00:48
  • You're right. I was deleting the remote. That was the problem. – vsespb Mar 02 '13 at 08:21