In our organization, where we are trying to introduce Git, we have run into a problem with how Git handles binary files.
Our projects will have a good mix of binary and text files, and a typical project size could be 1 GB. Our fear is that after a few years a full clone would become too big and cause performance and disk space issues.
One of the environments that would migrate to Git currently keeps its software on a system called TCM. The total size of its repositories, with 7-10 years of versions, is 2 TB.
Another environment, on ClearCase, has around 1 TB of data from 7-8 years.
With Git not storing deltas, which particularly affects binary files, the situation after 5+ years is causing concern to our users.
The shallow clone feature would have been ideal, but the documentation says: "A shallow repository has a number of limitations (you cannot clone or fetch from it, nor push from nor into it), but is adequate if you are only interested in the recent history of a large project with a long history, and would want to send in fixes as patches." A cursory check shows that shallow clones work fine, but there are definitely known use cases where they won't work, hence the documented limitations.
Is there a known list of use cases where this won't work?
maxmelbin
- Update: Since Git 1.9, most limitations of shallow clones have been resolved. – sleske Jun 01 '14 at 18:12
- Git 2.5 (Q2 2015) supports fetching a single commit! I have edited my answer below, now referencing "[Pull a specific commit from a remote git repository](http://stackoverflow.com/a/30701724/6309)". – VonC Jun 08 '15 at 05:34
- VTC as unclear. The docs give a full summary of what it can't do. A "list of use-cases" is whatever you can imagine that uses those operations. Which is an infinite set, thus impossible to put into an answer. – ivan_pozdeev Dec 25 '17 at 17:47
2 Answers
5
I would urge you to store binary files in a dedicated repository that is easy to scale and easy to clean up: an artifact repository like Nexus.
You have other alternatives in "How to handle a large git repository?".
Trying to keep everything in Git, using it in some unnatural way, will always result in more trouble than it is worth: it is a source control tool. You might as well use it for what it is good for.
That being said, a shallow clone doesn't support push (or, at least, it is dangerous: see "Why can't I push from a shallow clone?").
For read-only purposes, a simple `git archive` would be enough, as mentioned in "not understanding git shallow clone".
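To illustrate the read-only option, here is a minimal sketch of exporting a snapshot with `git archive`. The scratch repository, the file name `readme.txt`, and the `demo` identity are made up for the example; in practice you would run only the `git archive` line inside your own repository.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch repository, created only so the sketch is self-contained.
work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"
echo "hello" > readme.txt
git add readme.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Export the tree at HEAD as a tar file. Only the files are written, no history.
git archive --format=tar --output="$work/snapshot.tar" HEAD

# Unpack the snapshot somewhere else; the result is a plain directory without .git.
mkdir "$work/export"
tar -xf "$work/snapshot.tar" -C "$work/export"
ls -A "$work/export"
```

The exported directory is not a repository, so this only suits consumers who need the files and never commit back.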
Updates 2015:
- Actually, you can now use a shallow clone for push/pull; see "Is `git clone --depth 1` (shallow clone) more useful than it makes out?".
- You can (with Git 2.5+) even fetch a single commit. See "Pull a specific commit from a remote git repository".
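As a rough sketch of what a shallow clone looks like in practice, the script below builds a throwaway three-commit repository, clones it with `--depth 1`, and then fetches the rest of the history with `git fetch --unshallow`. The repository layout, the file name `data.txt`, and the `demo` identity are assumptions made for the example, and it presumes a reasonably recent Git (1.8.3 or later for `--unshallow`).

```shell
#!/bin/sh
set -eu

# Hypothetical origin repository with three commits, used only for the demo.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
for i in 1 2 3; do
  echo "revision $i" > data.txt
  git add data.txt
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
done

# Use a file:// URL: with a plain local path, Git ignores --depth.
git clone -q --depth 1 "file://$work/origin" "$work/shallow"
cd "$work/shallow"

# Only the most recent commit is present in the shallow clone.
shallow_count=$(git rev-list --count HEAD)
echo "commits after shallow clone: $shallow_count"

# Older history can still be fetched later if it turns out to be needed.
git fetch -q --unshallow
full_count=$(git rev-list --count HEAD)
echo "commits after unshallow: $full_count"
```

Starting shallow and deepening on demand keeps the initial clone small while leaving the full history reachable.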
1
Git Annex solves the "big-binary-files in/near git" problem quite beautifully, as well.

Andreas Klöckner