I have code that is currently 'open source' but not easily accessible, since it is not available from a repository. The idea is to make it available via either SourceForge or GitHub, but I am willing to use any free site that supports the requirements.

Project description

The project consists of code in many modules (Java packages under the org.pscode hierarchy), and some dependencies (e.g. 50 meg of cross-compilation plug-in, 5-10 meg of MP3..). The project contains many stand-alone applications as disparate as a cross-compilation compiler meant for developers, and a music juke box for end users. But there are also single classes (such as BigClip, which can hold a large sound clip) that are useful in applications outside the project itself. While some of the packages are effectively 'self contained' (e.g. JaNeLA), others are components used across a range of existing applications.

The entire project as it exists on my machine is already pushing 200 Meg.

As a developer who wanted to play a long clip, I would be hesitant to download 200+ megabytes of project just to get a class whose code is short enough to post on SO.

Code repository requirements

  • A code repository/sharing system that allows the user to download only the parts they require.
  • If the part they require depends on several separate parts of the main project, those parts are downloaded automatically.
  • Provide an automatic way to calculate (for display to the user) how much download is required for each 'sub' project. OK, that is not a requirement that will 'make or break' my choice, but it would be very handy.
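
To make the third point concrete, here is a rough sketch of the kind of calculation I have in mind. It assumes each top-level directory of the project is one 'sub' project, which is only an assumption for illustration, and the function names are made up. It also measures uncompressed size on disk, so it only approximates the actual download.

```python
import os


def subproject_sizes(project_root):
    """Map each top-level directory under project_root to its total size in bytes.

    Assumption: one top-level directory equals one 'sub' project.
    """
    sizes = {}
    for entry in sorted(os.listdir(project_root)):
        sub_path = os.path.join(project_root, entry)
        # Skip loose files and hidden directories such as .git
        if not os.path.isdir(sub_path) or entry.startswith("."):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(sub_path):
            for name in filenames:
                file_path = os.path.join(dirpath, name)
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
        sizes[entry] = total
    return sizes


def format_size(num_bytes):
    """Render a byte count as megabytes with one decimal place."""
    return f"{num_bytes / (1024 * 1024):.1f} MB"
```

Calling `subproject_sizes` on the project root and passing each value through `format_size` would give a per-part figure that could be shown next to each download.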

I was looking at Git, since it seems to have many advances over older version control systems such as CVS. I was reading Git For Eclipse Users and got to point 3 of Distributed Version Control Systems, which starts..

Given that there is no master repository, it becomes clear that the repository must live in its entirety on each of the nodes in the DVCS. ..

That worries me, but I am not sure I fully understand it, or whether there is another mechanism within Git to provide the behavior required.

Question(s)

Can Git fulfill the requirements stated above?

On another tack, is this a 'non-question' for most developers? If people will typically download 200 meg of project just to get 3 KB of code to parse a file of comma-separated values, then perhaps I am worrying over nothing.

Andrew Thompson

2 Answers

You can use Git submodules. A submodule in a Git repository is a reference to another Git repository. Normally, when you clone a Git repository, you must copy everything. Submodules, however, let you split a large project into many separate repositories, each of which can be cloned on its own.

Note, however, that each submodule must live in its own subdirectory, so if you want source code split across separate submodules, you might need to move files around and perhaps change your linking.

When you clone the main project, Git does not download the submodules automatically; you have to ask for them explicitly. It is, however, possible to fetch all submodules in one step. For more details, see:

Easy way pull latest of all submodules
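
As a minimal sketch of how a user might fetch only the parts they need, the helper below builds the Git commands for cloning the main project and then initialising selected submodules. The repository URL, the target directory `pscode`, and the submodule path `audio` are hypothetical placeholders, not the real layout of the project. `git submodule update --init` is the standard command for populating a submodule after a plain clone.

```python
import subprocess


def selective_clone_commands(repo_url, target_dir, wanted_submodules):
    """Build the git commands that clone the parent repo, then fetch chosen submodules."""
    # A plain clone leaves every submodule directory empty
    commands = [["git", "clone", repo_url, target_dir]]
    for path in wanted_submodules:
        # Initialise and fetch just this one submodule inside the clone
        commands.append(
            ["git", "-C", target_dir, "submodule", "update", "--init", "--", path]
        )
    return commands


def run_commands(commands):
    """Execute each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical URL and submodule path, printed rather than executed
    for argv in selective_clone_commands(
        "https://example.invalid/pscode.git", "pscode", ["audio"]
    ):
        print(" ".join(argv))
```

To fetch everything instead, running `git submodule update --init --recursive` inside the clone pulls all submodules, including nested ones.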

ronalchn
  • Thanks. I was also looking at some documentation for sub-modules but quickly got the (apparently wrong) impression that they were not what I was looking for. I'll do some more reading and report back. – Andrew Thompson Aug 25 '12 at 03:36

This is not really possible with Git itself. However, if you host your code on a Git hosting service such as GitHub, users can browse through your files and download the ones they need. See Git vs Subversion:

Git requires you to clone the entire repository (including history) and create a working copy that mirrors at least a subset of the items under version control

Although Git's submodule feature can help with this problem, you must manage multiple repositories and ensure coordination between them. This can be somewhat messy if you are still mainly in development, and may require some refactoring.

smang
  • (From link) *"Git was designed from the ground up as a distributed version control system."* I cannot stress enough how much the 'distributed' part of that is not a requirement. It is not a concept that had occurred to me until today. If someone wants to set up a fork of the code base they are welcome to, but I do not care how much effort it takes them. That is making me look at Subversion more closely. Thanks for the link. – Andrew Thompson Aug 25 '12 at 04:08
  • Not true. You can just do a checkout if you are not interested in history and you're done. Naturally you can checkout a single file, too. So distribution with git is optional, not a requirement. Normally you want history and distribution, so you use git that way. But you are not forced to. – hakre Aug 29 '12 at 15:47
  • I believe you must do a clone before you can do a checkout. In that case, you will get the full history. – smang Aug 31 '12 at 04:48