... From what I have read I think I need to change the commit at which one of my submodules is pulling.
Maybe. It may also be the case that you just need to poke someone to update the upstream submodule. Or, maybe you should just use a different commit in the superproject. Or maybe your submodule is cloned as a shallow repository, but should not be. (I think this last is the most likely, based on the fact that https://github.com/ringo/ringojs/tree/298e62daa64923b7bc1e4a085233529f907ba7bf exists.)
If the problem is in fact a shallow clone, navigating to the submodule repository and running git fetch --unshallow should fix it. To do that in every submodule at once, you could use git submodule foreach git fetch --unshallow.
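One caveat with the foreach form: git fetch --unshallow refuses to run in a repository that is already complete, and git submodule foreach stops at the first command that fails. Here is a minimal sketch that guards against that, assuming Git 2.15 or later (for git rev-parse --is-shallow-repository). The function name unshallow_if_needed is made up for illustration.

```shell
# Hypothetical helper: deepen the repository in directory "$1",
# but only if Git reports that it is a shallow clone.
unshallow_if_needed() {
    if [ "$(git -C "$1" rev-parse --is-shallow-repository)" = "true" ]; then
        # Fetch the history that the shallow clone left out.
        git -C "$1" fetch --unshallow
    else
        echo "$1 is already a complete clone; nothing to do"
    fi
}
```

From the superproject, the same check can be run in every submodule with git submodule foreach 'if [ "$(git rev-parse --is-shallow-repository)" = true ]; then git fetch --unshallow; fi'.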
Background
You probably already know what a Git repository is: it's a collection (or database) of commits, with each commit representing a complete, intact snapshot of an entire source tree. Some particular commits are especially important, either right now, or always, so these commits have names: a branch name, like master, names the latest commit on that branch. It's important because it's new! or shiny! or whatever. Meanwhile, a tag name, like v1.2, names a commit some person thought was important, such as a stable release.
Each of these names (branch or tag, or really any other human-readable name you can use in a Git repository, like origin/master or whatever) is actually just a name for a raw hash ID. These hash IDs, which include things like 298e62daa64923b7bc1e4a085233529f907ba7bf, are apparently-random, big ugly hexadecimal numbers that are useless to humans. You have just seen an example of how such a number is not useful to you. But they are what Git uses to check out specific commits. When you use a name like master or v1.2, Git translates that name into the correct hash ID, and checks out that commit.
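To watch the name-to-hash translation happen, git rev-parse prints the hash ID that a name currently stands for. The sketch below builds a throwaway repository so that it does not touch any real one. The demo identity and the tag v1.2 are there for illustration only.

```shell
# Throwaway repository with a single commit, a branch name and a tag name.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git -C "$repo" tag v1.2

# The initial branch may be called master or main, depending on configuration.
branch=$(git -C "$repo" symbolic-ref --short HEAD)

# Both names are just labels for the same raw hash ID.
branch_hash=$(git -C "$repo" rev-parse "$branch")
tag_hash=$(git -C "$repo" rev-parse v1.2)
echo "$branch -> $branch_hash"
echo "v1.2 -> $tag_hash"
```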
Because a Git repository is (at least normally; this becomes important soon) fully self-contained, Git can make sure that all the names are valid and identify valid commits. When you tell Git: check out master, it's never the case that the name master exists and names commit a123456... and yet that commit doesn't exist. Either the commit does exist, and master can name it, or the commit doesn't exist, and master cannot name it. [1]
When you use a branch name to check out one specific commit (by running git checkout master or git checkout develop, for instance) Git turns the name into the hash ID, locates that commit in the database, and extracts that commit into a work-tree where you can use it and/or work on it. The commits inside the database are in a form usable only by Git itself, so without a work-tree, you could not do any work. At the same time that Git extracts the commit, Git remembers the name for you, so that you are now "on a branch".
You can, however, also select any historical commit you like, by its raw hash ID, and run git checkout a123456.... That commit must exist of course, but assuming it does, Git extracts that commit into your work-tree, and remembers that you're not on any branch now. Instead, Git says that you have a "detached HEAD".
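The difference between "on a branch" and "detached HEAD" can be checked from a script: git symbolic-ref -q HEAD succeeds only when HEAD is attached to a branch name. This is a sketch in a throwaway repository, and the helper demo_commit is a made-up name.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Hypothetical helper: make an empty commit with a fixed demo identity.
demo_commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}
demo_commit "first"
demo_commit "second"

# Check out the older commit by raw hash ID instead of by name.
first=$(git -C "$repo" rev-parse HEAD~1)
git -C "$repo" checkout -q "$first"

# symbolic-ref fails (quietly, with -q) when HEAD is not attached to a branch.
if git -C "$repo" symbolic-ref -q HEAD >/dev/null; then
    state="on a branch"
else
    state="detached HEAD"
fi
echo "$state"
```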
In general, you get a Git repository by cloning:
git clone <url> <directory>
clones the repository found at the given URL, and puts it in a particular directory. The last step of git clone is to git checkout a branch or tag (often the branch name master), but you can add arguments to git clone to say what to check out.
[1] This gives rise to the interesting case of a new, completely-empty Git repository. In such a Git repository, there are no commits yet. This means there are no branches! The branch name master does not exist yet, in a new, completely-empty repository. The very first commit you create becomes the latest commit on branch master, and in so doing, also creates the name master. This is because the name must always contain a valid commit hash ID.
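The empty-repository case in the footnote can be observed directly. In this sketch, using a throwaway repository, the branch that HEAD refers to has no ref until the first commit is made. The initial branch may be called master or main depending on configuration, so the script asks Git for it.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# HEAD already says which branch we are "on", even though it has no commits.
branch=$(git -C "$repo" symbolic-ref --short HEAD)

# Before any commit, the branch name cannot be resolved to a hash ID.
if git -C "$repo" rev-parse -q --verify "refs/heads/$branch" >/dev/null; then
    before=exists
else
    before=missing
fi

git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "very first commit"

# Making the first commit is what creates the branch name.
if git -C "$repo" rev-parse -q --verify "refs/heads/$branch" >/dev/null; then
    after=exists
else
    after=missing
fi
echo "before: $before, after: $after"
```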
Submodules
With all that in mind, let's take a moment to describe what a submodule is. The short version is that a submodule is just another Git repository. The only thing special about the submodule is that it exists because some other Git repository, which Git calls the superproject, says: "while I am my own Git repository, I would like to use another Git repository too." The superproject is required to supply the information you would have passed to git clone: the URL, and the path.
If you start working in the submodule, though, you find that it's almost always in the special "detached HEAD" state. That's because instead of checking out a branch or a tag, the superproject also tells the submodule Git: and by the way, after you've cloned or fetched everything, I want one specific commit and here is the hash ID: _______. The superproject supplies the raw hash ID: not a name like master or v1.2, just a raw hash ID.
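You can see the hash ID a superproject records by listing the submodule's path with git ls-tree: the entry has mode 160000 and type commit. The sketch below builds a throwaway superproject and submodule. The path lib and the helper demo_commit are made up, and protocol.file.allow=always is only there because the demo clones from a local directory.

```shell
tmp=$(mktemp -d)

# Hypothetical helper: empty commit in repository "$1" with message "$2".
demo_commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

# A tiny repository that will play the role of the submodule's upstream.
git init -q "$tmp/sub"
demo_commit "$tmp/sub" "library commit"

# The superproject adds it at path "lib" and commits that fact.
git init -q "$tmp/super"
demo_commit "$tmp/super" "initial"
git -C "$tmp/super" -c protocol.file.allow=always \
    submodule --quiet add "$tmp/sub" lib
demo_commit "$tmp/super" "add submodule"

# The tree entry for "lib" is a raw commit hash ID, not a branch or tag name.
entry=$(git -C "$tmp/super" ls-tree HEAD lib)
recorded=$(printf '%s\n' "$entry" | awk '{print $3}')
echo "$entry"
```

In a real superproject, git submodule status shows the same hash IDs, one line per submodule.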
This is the source of the error you are seeing. The superproject repository you chose, some version of Ant, lists a submodule repository by name (apparently this ringojs-fork thing). You can clone that other Git repository just fine. But then, your superproject tells your Git system: after getting the latest from ringojs-fork, check out commit 298e62daa64923b7bc1e4a085233529f907ba7bf. But commit 298e62daa64923b7bc1e4a085233529f907ba7bf does not exist.
Who's wrong? Is it the superproject, when it says "use commit 298e62daa64923b7bc1e4a085233529f907ba7bf"? Or is it the submodule, when it says "commit 298e62daa64923b7bc1e4a085233529f907ba7bf does not exist (yet)"? Or maybe even both are wrong. Well, sort of. The submodule repository is a repository, so it should be self-contained and have everything.
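One way to find out which side is missing something is to ask the submodule's repository directly whether it has the commit. This is a minimal sketch using git cat-file -e, which exits with success only if the object exists. The function name has_commit is made up for illustration.

```shell
# Hypothetical helper: succeed if the repository in directory "$1"
# contains a commit with hash ID "$2", fail otherwise.
has_commit() {
    git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null
}
```

For example, has_commit path/to/submodule 298e62daa64923b7bc1e4a085233529f907ba7bf && echo present || echo missing, where path/to/submodule is a placeholder for wherever the submodule lives in your work-tree.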
Where things can go wrong
Git is a distributed version control system, meaning there are many copies of every repository. Every clone is a copy, after all. But some clones might be more up to date than others. Suppose someone controlling this ringojs-fork forgot to run git push to update the GitHub clone. Then that someone might have commit 298e62daa64923b7bc1e4a085233529f907ba7bf, but have never sent it out for everyone else. In that particular case, you just need to get whoever controls this ringojs-fork to send that commit up, so that you can get it back down.
Or, perhaps 298e62daa64923b7bc1e4a085233529f907ba7bf existed at one time, and was available for everyone to use. It might even be out there in some clones. But something terrible was in 298e62daa64923b7bc1e4a085233529f907ba7bf, so whoever controls the ringojs-fork had it carefully excised. In other words, it was there, but isn't any more, and no one should ever ask for it. (This situation is problematic since other people might depend on it, or have it and put it back, or at least try to put it back. It's rarely good to rip commits out of public repositories like this.) In this particular case, we see that your Ant repository (your superproject) depends on it.
Well, more precisely, at least one specific commit in your superproject depends on this commit in the submodule. Maybe other commits in the superproject don't, in which case, if you switch to one of those other commits in the superproject, maybe that will cure the problem. No commit can ever be changed, so the particular commit in the superproject that you are using right now will always ask for this other particular commit (by hash ID) in the submodule. If the submodule's commit should never be asked-for, this particular commit in the superproject should simply never be used.
Or, there's one more possibility. I mentioned above that a repository is normally completely self-contained. There's a special exception, though, that's very common, called a shallow clone. A shallow clone is a clone that deliberately omits a lot of commits, so as to make cloning faster.
Omitting lots of commits to make cloning faster is great, up until someone asks for one of those commits. Now, Git is not totally stupid (it is only just mostly stupid), so a shallow clone that omits some commit also omits any name for that commit. But superprojects don't ask for commits by name; they ask directly by raw hash ID. This means that if you make shallow clones of submodules, it's really easy for the superproject to call for a commit that was not copied in the clone step.
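The effect is easy to reproduce. This sketch makes a throwaway repository with three commits, clones it with --depth 1, and then asks the shallow clone for the oldest commit by raw hash ID. The file:// URL is needed because Git ignores --depth when cloning from a plain local path, and the helper demo_commit is a made-up name.

```shell
tmp=$(mktemp -d)
git init -q "$tmp/full"

# Hypothetical helper: empty commit in the full repository.
demo_commit() {
    git -C "$tmp/full" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}
demo_commit "one"
demo_commit "two"
demo_commit "three"

# Remember the hash ID of the oldest commit, as a superproject might.
old=$(git -C "$tmp/full" rev-parse HEAD~2)

# Shallow clone: only the newest commit is copied.
git clone -q --depth 1 "file://$tmp/full" "$tmp/shallow"

# Asking for the old commit by hash ID fails in the shallow clone.
if git -C "$tmp/shallow" cat-file -e "$old^{commit}" 2>/dev/null; then
    old_present=yes
else
    old_present=no
fi
echo "old commit present in shallow clone: $old_present"
```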
The root of the problem is that a superproject lists a submodule's commit by raw hash ID, and the two Git repositories are only loosely coupled. Anything that causes a commit in the superproject repository to list a hash ID that is not available, for whatever reason, in the submodule repository, will lead to this error. That includes the occasional failure to push, but mostly includes cases of shallow cloning.
(Note that if a submodule moves, from one GitHub URL to another for instance, you may have to adjust the submodule clone, which may list the old URL, in the same way you have to adjust the superproject's URL if the superproject's URL changes.)
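For the moved-URL case, git submodule sync copies the URL recorded in .gitmodules into the existing submodule clone's configuration. The sketch below uses a throwaway superproject. The path lib, the helper demo_commit and the new URL are all made up, and in real use the .gitmodules change would normally arrive through a commit from upstream rather than a local edit.

```shell
tmp=$(mktemp -d)

# Hypothetical helper: empty commit in repository "$1" with message "$2".
demo_commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q "$tmp/sub"
demo_commit "$tmp/sub" "library commit"
git init -q "$tmp/super"
demo_commit "$tmp/super" "initial"
# protocol.file.allow=always is only needed for this local-path demo.
git -C "$tmp/super" -c protocol.file.allow=always \
    submodule --quiet add "$tmp/sub" lib
demo_commit "$tmp/super" "add submodule"

# Pretend the submodule moved: .gitmodules now lists a new (made-up) URL.
new_url="https://example.com/moved/lib.git"
git -C "$tmp/super" config -f .gitmodules submodule.lib.url "$new_url"

# Copy the new URL into the existing submodule clone; no network access needed.
git -C "$tmp/super" submodule --quiet sync
synced=$(git -C "$tmp/super/lib" remote get-url origin)
echo "$synced"
```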