6

I'm already familiar with answers to the general question of how to manage large binary files (or a large number of files) in git. And I've looked at git-annex, bup, and git-media. None of them claims good support for Windows users.

Are there any such programs that run well on Windows?


To add context, I'm trying to version control my system deployment bits: OS images, drivers, 3rd party installers, 1st party installers (our applications). I need to have everything in a coherent bundle (tags). And be able to get the entire bundle for any of our active releases.

Anthony Mastrean
  • I have updated my answer to address your context. – VonC Aug 12 '11 at 06:26
  • git-annex appears to be coming along on Windows, and the [assistant](http://git-annex.branchable.com/assistant/) for it is getting ported to Windows this month. Soon, maybe! It's a bit silly that this kind of thing didn't exist years ago. – iono May 29 '13 at 08:33

2 Answers

3

Not that I know of, and as you know, large binary files aren't compatible with:

  • VCS (since there is no sense in diff'ing them or merging them)
  • DVCS (where a distributed repo is meant to be cloned, which quickly becomes cumbersome with an obese repo)

The only remaining solution (OS-agnostic, actually) is an external artifact repository (like Nexus, for instance) to store those binaries.


The OP Anthony Mastrean adds that he needs to:

version control my system deployment bits: OS images, drivers, 3rd party installers, 1st party installers (our applications).
I need to have everything in a coherent bundle (tags). And be able to get the entire bundle for any of our active releases

That would be mixing:

  • development requirements (versioning, meaning branching and comparing versions)
  • with deployment requirements (getting all the right labels in order to deploy and run)

Anything which isn't developed (i.e. anything built or already existing) should be kept out of a VCS (except for very small resources, like icons for instance, which don't change much).

What you usually version is a "release file" which contains all the extra information (checksums, paths to other repositories, ...) that the deployment script needs in order to fetch the right artifacts.
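As a rough illustration of that idea, here is a minimal Python sketch of a release file and the check a deployment script might run against it. The JSON layout, the function names (`write_release_file`, `verify_release`) and the `store` directory are all assumptions made up for this example, not something prescribed by any particular tool. The binaries stay on a share or in an artifact repository; only the small JSON file would be committed and tagged.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large OS images are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_release_file(release: str, store: Path, names: list[str], out: Path) -> None:
    """Record each artifact's relative path and checksum (hypothetical format)."""
    entries = [{"path": name, "sha256": sha256_of(store / name)} for name in names]
    out.write_text(json.dumps({"release": release, "artifacts": entries}, indent=2))


def verify_release(release_file: Path, store: Path) -> list[str]:
    """Return the paths of artifacts that are missing or fail their checksum."""
    manifest = json.loads(release_file.read_text())
    bad = []
    for entry in manifest["artifacts"]:
        candidate = store / entry["path"]
        if not candidate.is_file() or sha256_of(candidate) != entry["sha256"]:
            bad.append(entry["path"])
    return bad
```

A deployment script would call `verify_release` on the release file checked out at a given tag and refuse to continue if the returned list is not empty.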

VonC
  • So, we know it's ok to version development bits. And what comes out of that are release files. You're saying to just dump them out on a known share with some kind of labeling scheme (version number, release name, etc). And version the deploy scripts, which act on the release files? – Anthony Mastrean Aug 12 '11 at 13:31
  • @Anthony: you just described what maven does ;) the `pom.xml` is versioned, but all the artifacts are stored in a Maven repo, which is nothing more than a shared directory with a structure following a precise naming convention (group, artifact, version). So even if your project has nothing to do with a java-maven project, the idea behind those release policies stands. – VonC Aug 12 '11 at 14:03
  • Perforce supports large files just fine, and is an excellent Version Control System. – ptman May 24 '12 at 13:41
  • @ptman it is an excellent VCS indeed, but since it is a *Distributed* one, it is best not to put large files in it, in order to keep the clone reasonable. – VonC May 24 '12 at 14:29
  • Ok, see I was just commenting on you saying that "large binary files aren't compatible with VCS". Since Perforce is a VCS that handles large binary files fine, that cannot be true. But then you reply with calling Perforce a distributed VCS, which it is not. So instead of correcting the misinformation, you provided another false claim. – ptman May 24 '12 at 14:57
  • @ptman: very sorry, I was doing too many tasks at once. I was thinking about Mercurial! Yes, Perforce is a centralized VCS, able to cope with large files indeed. – VonC May 24 '12 at 15:50
  • @ptman My "large files aren't compatible" bit is about VCS features (diff and merge), which are *usually* not supported for binaries (even though certain types can be supported). Or they require a special merge resolution process (like `p4 resolve -t -as`, for instance: http://kb.perforce.com/article/150/resolving-binary-files) – VonC May 24 '12 at 15:53
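
The Maven-style naming convention mentioned in the comments above (group, artifact, version) can be sketched in a few lines of Python. This is only an illustration of the directory layout: the function name `artifact_path` and the sample coordinates in the usage note are hypothetical, and nothing here talks to a real Maven repository.

```python
from pathlib import PurePosixPath


def artifact_path(group: str, artifact: str, version: str, extension: str) -> PurePosixPath:
    """Build the relative path of one artifact under a Maven-style layout."""
    # Dots in the group become directory levels: "com.example" -> com/example
    group_dirs = group.split(".")
    filename = f"{artifact}-{version}.{extension}"
    return PurePosixPath(*group_dirs, artifact, version, filename)
```

For example, `artifact_path("com.example.deploy", "printer-driver", "2.1.0", "zip")` gives `com/example/deploy/printer-driver/2.1.0/printer-driver-2.1.0.zip`, a path that a deploy script could resolve against the root of the share.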
2

There is no good way to manage large binary files in Git or any other version control system. The consensus is that you need a digital asset management (DAM) system to do this. Digital assets are things like photos, sound clips, videos, etc.

There are a number of open source DAM packages out there, and this page reviews all the major ones: http://www.opensourcedigitalassetmanagement.org/

If you don't need support for versioning, lots of people build quick solutions using something like MongoDB for storage.

Michael Dillon