26

I have read in several places that it's possible to share the objects directory between multiple git repositories, e.g. with symbolic links. I would like to do this to share the object databases between several bare repositories in the same directory:

shared-objects-database/
foo.git/
  objects -> ../shared-objects-database
bar.git/
  objects -> ../shared-objects-database
baz.git/
  objects -> ../shared-objects-database

(I'm doing this because there are going to be lots of large blobs redundantly stored in each objects directory otherwise.)

My concern about this is that when using these repositories, git gc will be called automatically and cause objects which are unreachable from one repository to be pruned, making the other repositories incomplete. Is there any easy way of ensuring that this doesn't happen? For example, is there a config option that would force --no-prune to be the default for git gc, and, if so, would that be sufficient to use this setup without risking losing data?

At the moment, I've been using the objects/info/alternates mechanism to share objects between these repositories, but maintaining these pointers from each repository to all the others is a bit hacky.

(My other alternative is just to have a single bare repository, with all the branches of foo.git, bar.git and baz.git named foo-master, foo-testing, bar-master, etc. However, that'd be a bit more work to manage, so if the symlinked objects directory can work safely, I'd rather do that.)

You might guess that this is one of those Using Git For What It Was Not Intended use cases, but I hope the question is clear and valid nonetheless ;)

Mark Longair
  • I'm curious why it's more work to manage extra refs within one repository. (Also, you can name them foo/master, foo/testing, bar/master - bit better for organization. You can see from the history of git.git that they use that kind of setup.) – Cascabel Mar 04 '10 at 01:14
  • 1
    OK :) I have a large USB disk with a similar repository structure to that described above, and on each of the computers (e.g. "foo") a symlink ~/.git -> /media/big-disk/foo.git - I'm using a modified version of gibak for backup and "time-travel" through the history of my home directory on each of these computers when the disk is plugged in. If I had a single repository with different branches, I'd need an extra step after plugging in (changing HEAD manually or "git checkout --leave-my-working-tree-alone foo-master" (?)) before things like "git diff" would work in the obvious way. – Mark Longair Mar 04 '10 at 15:22
  • 2
    You might also be interested in [git-new-workdir](https://github.com/git/git/blob/master/contrib/workdir/git-new-workdir), which sounds like it works for my similar use-case (multiple checkouts of possibly-unrelated branches in the same repo, only slightly ew!). It symlinks `refs` and `packed-refs` which should stop `git gc` from nuking anything that's been committed; you just need to point each HEAD at a different branch. Stuff in your index is another issue, but if there's nothing important that isn't in the repo or your working tree, `rm .git/index; git reset HEAD` seems to do the trick. – tc. Apr 17 '13 at 17:57
  • Googlers might also be interested to know that `git clone --shared` is a way to create such repos: https://stackoverflow.com/questions/23304374/what-are-the-differences-between-git-clone-shared-and-reference – Ciro Santilli OurBigBook.com Dec 06 '18 at 14:32

2 Answers

17

Perhaps this was added to git after this question was asked/answered: it seems there is now a way to do this explicitly. It's described here:

https://git.wiki.kernel.org/index.php/Git_FAQ#How_to_share_objects_between_existing_repositories.3F

How to share objects between existing repositories? Do

echo "/source/git/project/.git/objects/" > .git/objects/info/alternates

and then follow it up with

git repack -a -d -l

where the -l means that it will only put "local" objects in the pack-file (strictly speaking, it will put any loose objects from the alternate tree too, so you'll have a fully packed archive, but it won't duplicate objects that are already packed in the alternate tree).
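For anyone who wants to try the two commands above without touching a real project, here is a minimal sketch. The scratch directory and the repository names `source` and `borrower` are made up for illustration, and the demo identity passed with `-c` is only there so the commit succeeds on a machine with no git identity configured.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout: "source" holds the objects, "borrower" is a
# clone that should stop keeping its own copies of them.
work=$(mktemp -d)

git init -q "$work/source"
echo "hello" > "$work/source/file.txt"
git -C "$work/source" add file.txt
git -C "$work/source" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "initial commit"

# --no-hardlinks forces real copies, so the borrower starts with duplicates.
git clone -q --no-hardlinks "$work/source" "$work/borrower"

# Pack the source so its objects are "already packed in the alternate tree".
git -C "$work/source" repack -q -a -d

# Point the borrower at the source's object directory ...
mkdir -p "$work/borrower/.git/objects/info"
echo "$work/source/.git/objects" > "$work/borrower/.git/objects/info/alternates"

# ... and repack so that only local objects stay in the borrower.
git -C "$work/borrower" repack -q -a -d -l

# Inspect what is left locally, and check the history is still readable.
git -C "$work/borrower" count-objects -v
git -C "$work/borrower" log --oneline
```

Note that the borrower now depends on the source: if objects are pruned or deleted there, the borrower can break.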

Tobias Kienzler
Chris Leishman
  • 1,777
  • 13
  • 19
  • 5
    I was tempted to downvote this link-only answer. But then I thought "Meh, let's edit it in" – Tobias Kienzler Jan 27 '17 at 13:29
  • 2
    The question said this was the technique they were already using. It's a good solution in some cases, but for others it is problematic. This approach creates a one-way sharing, so once you start fetching via the individual repos they accumulate their own duplicative objects. It doesn't truly achieve the goal of eliminating duplicates unless you _only_ fetch from the central objects repo. – The Mighty Chris Oct 09 '18 at 19:14
9

Why not just crank the gc.pruneExpire variable up to never? It's unlikely you'll ever have loose objects 1000 years old that you don't want deleted.

To make sure that the things which really should be pruned do get pruned, you can keep one repo which has all the others as remotes. git gc would be quite safe in that one, since it really knows what is unreachable.
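A rough sketch of that "one repo with all the others as remotes" idea, under some assumptions: the repositories `foo`, `bar` and `baz` are throwaway stand-ins created in a scratch directory, `all.git` is an invented name, and this version keeps its own object store rather than the symlinked one from the question.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical stand-ins for the question's repositories, one commit each.
for name in foo bar baz; do
    git init -q "$work/$name"
    echo "$name" > "$work/$name/file.txt"
    git -C "$work/$name" add file.txt
    git -C "$work/$name" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit in $name"
done

# The aggregate repository has every other repository as a remote, so its
# remote-tracking refs cover the branches of all of them.
git init -q --bare "$work/all.git"
for name in foo bar baz; do
    git -C "$work/all.git" remote add "$name" "$work/$name"
done
git -C "$work/all.git" fetch -q --all

# Pruning here only drops objects that none of the fetched branches reach.
git -C "$work/all.git" gc -q --prune=now

# List the refs that are keeping objects alive.
git -C "$work/all.git" for-each-ref --format='%(refname)'
```

This is only as safe as the last fetch: anything committed in foo, bar or baz since then (or reachable only from their reflogs or index) isn't known to the aggregate repo.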

Edit: Okay, I was a bit cavalier about the time limit; as is pointed out in the comments, 1000 years isn't gonna work too well, but the beginning of the epoch would, or never.
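As a small sketch of the setting itself: the bare repository below is a throwaway stand-in for foo.git, created in a scratch directory. It only shows the config being set and a plain `git gc` leaving an unreachable object alone; it doesn't establish that this is sufficient for the symlinked layout.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical bare repository standing in for foo.git from the question.
git init -q --bare "$work/foo.git"

# Tell "git gc" in this repository never to expire unreachable loose objects.
git -C "$work/foo.git" config gc.pruneExpire never

# Write a blob that no ref points at, i.e. an unreachable object.
blob=$(echo "only some other repo needs me" |
    git -C "$work/foo.git" hash-object -w --stdin)

# Run a normal gc. A fresh object would also survive the default grace
# period; the setting matters once unreachable objects get old.
git -C "$work/foo.git" gc -q

# The blob is still readable afterwards.
git -C "$work/foo.git" cat-file -p "$blob"
```

The setting is per repository, so each repository sharing the objects directory would need it (or it could go in the global config). An explicit `git gc --prune=now` or `git prune` still deletes unreachable objects regardless.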

Cascabel
  • Thanks for your answer - in particular the suggestion about how to prune safely. – Mark Longair Mar 04 '10 at 15:26
  • 2
    Jefromi: perhaps you could update your answer with a couple of notes? I'm not 100% sure of this, but playing with the tests in git.git, I suspect that values over 39 years may not work, since they go back before the beginning of the epoch. However, since this commit: http://github.com/git/git/commit/cbf731ed4ec511f2c32598e03d7865f35881fea2 you can set gc.pruneExpire to "never" and that will work. (From "git tag --contains cbf731ed4ec511f2" it looks as if versions after (and including) v1.7.0.3 should be fine.) – Mark Longair Jun 11 '10 at 07:53