Git pull: error: Server does not allow request for unadvertised object

Question

I have two projects: an original one and its fork.

I'm trying to pull all the latest changes from the original project's Git repository into the fork's.

```
git pull ssh://site/original.git
```

This gives me the following strange error:

```
* branch              master     -> FETCH_HEAD
Fetching submodule vmscl
Server refused to set environment variables
From ssh://site/vmscl
= [up to date]      master     -> origin/master
Server refused to set environment variables
error: Server does not allow request for unadvertised object 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac
Errors during submodule fetch:
vmscl
```

There are no custom modifications to the vmscl submodule in the fork project, so it should pull without any issue, just by fast-forwarding. At the same time, I do not know where it gets this 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac from. I can't find a commit with this hash anywhere.

Sounds like some git bug / corruption of my local repository.

I've tried everything from here and from here. Nothing helped.

What if you set a remote (e.g. upstream) with the original repo's url then just pull the repo's master branch into your fork's repo e.g. `git pull upstream master`? — Sajib Khan, Aug 28 '21 at 15:09

score 1 · Answer 1 · answered Aug 29 '21 at 09:56

At the same time, I do not know where it gets this 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac from.

That hash ID comes from a gitlink in the superproject.

Whenever you're using submodules, you are using at least two Git repositories. We call one the superproject and one the submodule. The superproject Git repository simply "lives above" the submodule Git repository, in terms of the checked-out working trees. For instance, if you run:

```
git clone ssh://github.com/org/super.git
```

to clone the superproject, you wind up with super/ in the current directory containing the cloned repository (super/.git) and a working tree (everything that's not .git within super/).

The superproject repository checkout in the super/ directory will contain a file named .gitmodules. Inside this file, the superproject stores the URL for the submodule. The Git running git checkout in the superproject checks out some particular commit into the super/ directory. This fills in that Git's index while also filling in your working tree, in the super/ directory. So we now:

```
cd super/
```

to get into the superproject.

Let's suppose that the submodule ssh://github.com/org/sub.git is supposed to be cloned into the path lib/sub here in the working tree in super/ that we just cd-ed into. The superproject Git will have made an empty directory, lib/sub,¹ in the working tree. There is as yet no clone of sub.git in this repository,² so you now have to run:

```
git submodule update --init
```

This reads both the .gitmodules file and the index.

As I mentioned above, the initial checkout filled in both Git's index and your working tree. The index—or staging area, as it is increasingly being called these days—is a data structure that other version control systems keep hidden, but Git exposes to you, the programmer, and makes you understand it.³ If you don't understand it yet, you had best learn about it now: jump to the appended section describing the index, then return here. (I'd put in a link but StackOverflow won't let you insert HTML anchors in an answer.)

Anyway, in the index, for each submodule that this particular superproject "links to", there is an entry that Git calls a gitlink. The gitlink contains two parts:

there is a file path, in this case lib/sub; and
there is a big ugly hash ID: in your case, 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac.
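To make the gitlink less abstract, here is a small sketch that fabricates one by hand in a throwaway repository and reads it back. The temporary directory, the lib/sub path, and the use of git update-index --cacheinfo are assumptions made for the demonstration; git submodule does not create gitlinks this way. Mode 160000 is what marks an index or tree entry as a gitlink instead of a file.

```shell
#!/bin/sh
# Sketch: create a gitlink by hand and read it back (demo repository only).
set -e
demo=$(mktemp -d)
git init -q "$demo/super"
cd "$demo/super"

# Hash ID from the question. The commit object does not need to exist in
# this repository: a gitlink only records the ID.
hash=8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac

# Mode 160000 makes this index entry a gitlink at path lib/sub.
git update-index --add --cacheinfo 160000,"$hash",lib/sub

# Read the entry back from the index in two ways.
index_entry=$(git ls-files -s lib/sub)
index_hash=$(git rev-parse :lib/sub)

# Write the index out as a tree (what a commit would store) and list it.
tree=$(git write-tree)
tree_entry=$(git ls-tree -r "$tree")

printf '%s\n%s\n%s\n' "$index_entry" "$index_hash" "$tree_entry"
```

In a real superproject, the same git ls-files -s and git rev-parse :path commands show which commit the index expects for a given submodule.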

Running git submodule update --init looks inside lib/sub and sees that it is empty, so this now reads the .gitmodules file to find what git clone command to run, then runs a new git clone:

```
git clone -n ssh://github.com/org/sub.git lib/sub
```

for instance. The -n option here is short for --no-checkout: the initial clone is run without any checkout. We'll see why in just a moment.

There is one extra wrinkle in modern Git: it stores the submodule's repository in a special directory under the superproject's own repository, in super/.git/modules/, and creates a lib/sub/.git file (rather than a directory) that points to it. In the near future there will also be some additional breadcrumbs left behind so that Git work in the submodule can "know" that the submodule is being used as a submodule. At the moment, though, once the submodule clone happens, the submodule is quite unaware of the fact that it is a submodule. As far as the submodule in lib/sub is concerned, it's just a regular old Git repository. It just has not yet checked anything out.
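The layout described here can be inspected with a rough local sketch. It assumes a throwaway directory and a local repository standing in for ssh://github.com/org/sub.git, and it uses git submodule add as a shortcut to create the submodule; the protocol.file.allow=always setting is only there because recent Git versions refuse to clone submodules from local paths by default.

```shell
#!/bin/sh
# Sketch: where a submodule's repository actually lives (local demo only).
set -e
demo=$(mktemp -d)

# A tiny repository acting as the submodule's upstream (hypothetical stand-in).
git init -q "$demo/sub"
git -C "$demo/sub" -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m "initial commit"

git init -q "$demo/super"
cd "$demo/super"

# Allowing the file transport is acceptable here because everything is local.
git -c protocol.file.allow=always submodule --quiet add "$demo/sub" lib/sub

# lib/sub/.git is a plain file whose content points at the real repository.
gitfile=$(cat lib/sub/.git)

# The real repository sits under the superproject's .git directory.
if [ -d .git/modules/lib/sub ]; then modules_dir=present; else modules_dir=absent; fi

printf '%s\n%s\n' "$gitfile" "$modules_dir"
```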

Now that the clone exists, git submodule update (with or without --init: the --init just does the clone step if needed) uses the value it read out of the gitlink to do something. Exactly what it does at this point is complicated, so let's go through the main cases:

The usual case is that the superproject Git runs the equivalent of:
```
(cd lib/sub && git checkout $hash)
```
where $hash is the hash ID read from the gitlink. This puts the submodule into detached HEAD mode, on the given commit—provided that the submodule clone has that commit, that is.

Adding --remote to the git submodule update command makes it run git fetch in the submodule, and then read the hash ID from one of the origin/name branch names updated by this git fetch. That hash ID replaces the $hash above. That is:
```
(cd lib/sub && git fetch && git rev-parse origin/master)
```
for instance. If this works, the output from the rev-parse goes into $hash, replacing the hash ID the superproject Git got from its index.

If the $hash can't be found, and we haven't just run git fetch in the submodule, a modern Git will run:
```
(cd lib/sub && git fetch origin $hash)
```
This makes a by-hash-ID request to the server that serves the submodule Git repository, at the URL stored under remote.origin.url by the earlier git clone step.

Not all servers allow fetching by hash ID. This was the case with your server:
```
error: Server does not allow request for unadvertised object 8cc85...
```
In this case, the git submodule update fails.

It seems as though either you have recursive checkout turned on, or you have some script that is running git pull for you. The output line:

```
Server refused to set environment variables
```

does not seem to be in the Git source code at all, which is odd.

¹This observation—that superprojects cause empty directories to be created—is at the heart of the trick of using an empty submodule to store an empty directory. See this answer to How can I add a blank directory to a Git repository?

²If you set the recursive checkout option, the initial checkout runs git submodule update --init for you here. I am describing the setup where you have to run it manually, since that's more instructive.

³You can sort of get away without learning about it for a while, especially if you use git commit -a. But some things in Git are simply inexplicable unless you know about the index. Don't try to skimp here! You don't need to know every detail, just (a) that it exists and (b) the items in the section below.

Things you might be able to do

You could try cloning without recursion turned on, then using git submodule update --init to get the clones to happen, then enter the failing submodule(s) and just run git fetch. With some luck, a full fetch will bring in the target commit (in this case 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac). Run git cat-file -t 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac to see if it is now available as an object of type commit. If so, a git checkout --recurse-submodules or git submodule update --recursive in the superproject working tree should now proceed (or at least get further).
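The git cat-file -t check can be wrapped in a small helper. This is a sketch run against a throwaway repository; the have_commit function name and the demo repository are made up for illustration. In practice you would run the check inside the failing submodule after git fetch.

```shell
#!/bin/sh
# Sketch: test whether a hash ID names a commit that this repository has.
set -e
demo=$(mktemp -d)
git init -q "$demo/sub"
cd "$demo/sub"
git -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m "first commit"

present=$(git rev-parse HEAD)                       # a commit we do have
missing=8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac    # the ID from the question

# Hypothetical helper: succeeds only if the ID is an object of type commit.
have_commit() {
    [ "$(git cat-file -t "$1" 2>/dev/null)" = commit ]
}

if have_commit "$present"; then present_status=available; else present_status=missing; fi
if have_commit "$missing"; then missing_status=available; else missing_status=missing; fi

printf '%s %s\n' "$present_status" "$missing_status"
```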

You could switch to a server that supports the "any SHA1 in want" request. GitHub's servers do this, for instance.

You could, if you have permission, find the server and reconfigure it to allow "any SHA1 in want":

```
git config uploadpack.allowAnySHA1InWant true
```

None of these will work if 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac refers to a commit that does not now exist in the submodule on the server. This can happen when:

someone creates that commit, but fails to push it to the server; or
someone creates that commit, pushes it to the server, but then later does a force-push to the server that removes access to that commit and it eventually gets garbage-collected.

In these two cases, commit 8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac is not available, and your only choices are:

find someone who does have it, or
choose some other commit in the submodule, such that the submodule checkout produces working software.

Whether simply using the latest version of the submodule software will do this is unpredictable. Perhaps the superproject depends on a bug in the submodule, and that bug is now fixed.

In any case, if you do find a submodule commit that allows you to proceed, you should consider making a new superproject commit that refers to that submodule hash ID. For instance, suppose that checking out the latest origin/master or origin/main commit in the submodule works:

```
(cd lib/sub && git checkout origin/master)
# build and test software -- it works!
```

Then you can, at this point, make a new superproject commit that refers to whatever hash ID is checked out in the submodule now, which obviously is available, by doing this:

```
git add lib/sub
git commit -m "update lib/sub to current version"
```

Consider your commit message carefully: you want it to convey why you made this commit. "Update lib/sub" gets partway there but is definitely not complete. It may be complete enough, depending on how interested, informed, and intelligent the users of your code base are.
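The effect of git add lib/sub on the gitlink can be tried out locally. The sketch below is a simplification: instead of a real submodule it uses a nested repository created with git init inside a throwaway superproject, which is enough for git add to record a gitlink. The directory names and the commit_in_sub helper are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: git add on a submodule path records whatever commit is checked out there.
set -e
demo=$(mktemp -d)
git init -q "$demo/super"
cd "$demo/super"

# A nested repository stands in for the cloned submodule.
git init -q lib/sub
commit_in_sub() {
    git -C lib/sub -c user.name=Example -c user.email=example@example.invalid \
        commit -q --allow-empty -m "$1"
}

commit_in_sub "old version"
old=$(git -C lib/sub rev-parse HEAD)

# Record the gitlink; Git prints a warning about embedded repositories here.
git add lib/sub 2>/dev/null
recorded_before=$(git rev-parse :lib/sub)

# Move the nested repository to a newer commit. The index still holds the old ID.
commit_in_sub "new version"
new=$(git -C lib/sub rev-parse HEAD)

# Re-adding the path updates the gitlink to the currently checked-out commit.
git add lib/sub 2>/dev/null
recorded_after=$(git rev-parse :lib/sub)

printf 'before: %s\nafter:  %s\n' "$recorded_before" "$recorded_after"
```

A git commit in the superproject at this point would store the new hash ID, which is what the two-command recipe above does.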

What to know about Git's index AKA staging area

Git's index or staging area is how Git:

keeps track of what you checked out from the current commit;
knows what you intend to put into the next commit; and
keeps track of any merge conflicts, if those occur.

Your initial checkout takes all the files out of some commit—every commit has every file—and copies them to Git's index and your working tree. The "copies" in Git's index are in Git's internal object format.

Keeping track of the current commit

Files inside commits are stored in a special, read-only, Git-only, compressed and de-duplicated form that Git calls a blob object. This takes care of the obvious objection to storing every file in every commit: in a big repository with thousands of files, we typically change just one file, or a handful of files, and then commit. If every commit contains every file, won't the Git repository balloon into some sort of multi-terabyte monstrosity that nobody can even store?

The answer is: no, it won't, because the files are de-duplicated. If the new commit you just made changes one file out of 5000 files, Git only has to add one file to its internal objects database. Then Git adds one new commit to its database too, and this one new commit refers to 4999 existing blobs, plus one new blob. The 4999 re-used files did not take any space at all!

So, even though every commit refers to ("contains") every file, it doesn't take a lot of space. The "copies" that Git sticks into its index are in this same format: they're indirect references to internal Git blob objects. If you "copy" 5000 files from a commit to Git's index, you copy no data to Git's index, because they're all duplicated. The index still needs a bit of space—on average, a bit under 100 bytes per file in a moderate-size repository that I checked—to record various data about the file, but that's all. The file's content isn't in the index, just the name and other stuff we call metadata: information about the file.
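De-duplication is easy to observe in a scratch repository. In this sketch (file names and the temporary directory are made up), two files with identical content are added, and both index entries end up referring to the same blob, with only one object stored.

```shell
#!/bin/sh
# Sketch: identical file content is stored once, as a single blob.
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

printf 'same content\n' > a.txt
cp a.txt b.txt
git add a.txt b.txt

# Each index entry records the blob hash ID for its path.
blob_a=$(git rev-parse :a.txt)
blob_b=$(git rev-parse :b.txt)

# Number of loose objects currently in the object database.
object_count=$(git count-objects | cut -d' ' -f1)

printf '%s\n%s\n%s\n' "$blob_a" "$blob_b" "$object_count"
```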

Since these copies are only readable by Git itself, and writable by nobody—not even Git—they're not actually useful yet though. That's why Git copies them to your working tree as well: the working tree copies of each file are actual, normal, everyday files, rather than weird Git-ified internal objects. Every program on your computer can deal with the working-tree files.

So, that's the explanation for the first bullet point: the index holds all the files, in this internal Git-ified read-only "blob object" form, and keeps track of all the files that Git put into the working tree. This is also the source of a key term regarding working tree files: A tracked file, in the working tree, is one that is in Git's index. That's it—that's all there is to "tracked" here—but it turns out to be significant.

Keeping track of the next commit

The middle bullet point above is the one where most Git users most often interact with Git. To make a new commit, you start—you have to start⁴—by checking out some existing commit. That fills in Git's index and your working tree.

The working tree copies of files are there for you to edit and update. You can also create all-new files, or remove existing files. As you update these files, or create new ones, or remove existing ones, you must tell Git about each one.⁵ When you do this, Git updates its index.

The git add command is the main command for doing this. Running git add on some file tells Git: Read the working tree copy of that file, compress it, and check for duplicate data. If there's a duplicate, update the index with the duplicate. If not, use the new compressed data to make a new blob, ready for committing, and update the index with the new blob. Either way, the file is now ready to be committed.

Every other file that's in the index remains there, untouched, also ready to be committed. If you remove a file entirely, you tell Git to remove its index copy (the name and metadata), and now the lack-of-file is ready to be committed.

Everything you do, it turns out, is in service of updating the index. Git doesn't build the new commit from what's in your working tree. Git builds the new commit from what's in Git's index. The index is your proposed next commit. Every time you add something to it—an updated file or a new file—or remove a file from being tracked by removing it from the index, you've updated your proposed next commit.

The next actual commit you make—whenever you make it—will "freeze" the index copies of the files into a new commit. These copies will then be available forever, or at least, as long as that new commit continues to exist.⁶ The new commit then becomes the current commit; the index and the commit now match, and we're in a similar situation to when we first checked out some commit.⁷
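A short experiment shows that the commit comes from the index and not from the working tree. This is a sketch in a throwaway repository; the file name and contents are invented for the demo.

```shell
#!/bin/sh
# Sketch: git commit stores the index copy, not the working tree copy.
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

printf 'staged version\n' > notes.txt
git add notes.txt                         # index now holds "staged version"

printf 'later edit, never added\n' > notes.txt   # working tree only

git -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "commit what is in the index"

committed=$(git show HEAD:notes.txt)      # what the new commit contains
working=$(cat notes.txt)                  # what is in the working tree

printf 'commit:       %s\nworking tree: %s\n' "$committed" "$working"
```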

⁴Except, that is, for the very first commit in a new, totally empty repository, or for the special case of git checkout --orphan or git switch --orphan. But we won't address the special cases here.

⁵Why doesn't Git find this out on its own? Well, it turns out that in new, ongoing-experimental code, Git does, but this is very hard to do—at least reliably—on many computers today. There are simplicity advantages, and other advantages, to not doing that. So Git didn't, initially, and now some people even depend on it.

In the old days, a lot of version control systems did find out on their own, at the time you ran their "commit" verb, whatever they called it. This could take a long time, so you'd run that command and then go take a coffee break or whatever. Git's near-instant commits were, when they first came out, astonishing.

⁶You can use git reset to hide away some commit or commits, so that it/they cannot be found. They do not immediately cease to exist, but eventually, if they are hidden long enough, Git decides you don't want those commits after all, and cleans them out.

⁷Since the working tree plays no part in the git commit process, we're allowed to git add only some files, then run git commit, and then git add more files and git commit again. Each new commit picks up the changes in the added files, while leaving the un-added files un-added. This is one of the things people depend on, as noted in footnote 5. Since Git exposes the index like this, and documents how this works, you're allowed to depend on it.

Keeping track of merge conflicts

The last special case for the index comes into play only when using Git commands that invoke Git's merge engine. The merge engine combines three existing commits to produce one new commit. To do this, it reads up to three copies of each file into an expanded version of the index.

The merge engine then takes the three (or fewer), "numbered slot" entries, 1 through 3, for each file and combines them. If the merge engine is able to combine them correctly—or what it thinks is correct, anyway—it immediately shrinks that index entry down to a single normal, "slot-zero", unconflicted entry for the file:

If the merge process can do this for every slot for every file, the merge is complete and Git can go on.
If not, the merge stops in the middle, leaving the mess in the index.
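The numbered slots can be seen with git ls-files -u during a conflicted merge. The following is a sketch in a throwaway repository; the branch name side, the file name, and the g helper are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: a conflicted merge leaves slots 1, 2 and 3 in the index.
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

# Helper for the demo: run git with a throwaway identity.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

printf 'base\n' > file.txt
git add file.txt
g commit -q -m "base"
start=$(git symbolic-ref --short HEAD)    # whatever the default branch is called

git checkout -q -b side
printf 'theirs\n' > file.txt
g commit -q -a -m "change on side"

git checkout -q "$start"
printf 'ours\n' > file.txt
g commit -q -a -m "change on $start"

# The merge stops with a conflict, so a non-zero exit status is expected.
merge_status=0
g merge -q side >/dev/null 2>&1 || merge_status=$?

# Each unmerged entry carries its slot number in the third field.
stages=$(git ls-files -u file.txt | awk '{ printf "%s", $3 }')

# Resolving the file and running git add collapses it back to slot zero.
printf 'resolved\n' > file.txt
git add file.txt
unmerged_after=$(git ls-files -u)
resolved_stage=$(git ls-files -s file.txt | awk '{ print $3 }')

printf 'slots during conflict: %s\nslot after git add: %s\n' "$stages" "$resolved_stage"
```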

Your job, as a programmer using Git to get work done, is then to resolve the mess in the index. Git leaves behind its best-effort at merging in your working tree as well, and you can use both the index information and the working tree copy to do your job.

This particular part just gets more complicated from here (e.g., whether you want to use git mergetool), and this answer is long enough already, so we'll leave most of the details out. But I will say that this produces a case that I consider a sort of flaw in Git. When the index is in this "merge conflict" state, you cannot write it out. You cannot make a new commit. In general, you can't proceed from here until you resolve the merge conflict. You have just the two options: finish the merge, or abort it entirely. Since Git is meant for doing distributed work, it really ought to allow distributed merging as well, with the ability to make special "conflict commits" or whatever they might be called. But you can't.

As I said, I can't find this `8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac` *anywhere*. Not in both superprojects, not in both submodules of these superprojects. Where does it come from? — Alexander Dyagilev, Aug 30 '21 at 10:31
It's in a gitlink in the superproject, and has presumably been copied into Git's index by now. Use `git ls-tree -r HEAD` or `git rev-parse :path/to/submodule` to see it in the current commit or the index. Because gitlinks aren't *files*, they are normally invisible, but `git ls-tree` and `git rev-parse` are included in the various tools that can make them visible. — torek, Aug 30 '21 at 10:33
As for who put it *into* a commit in the superproject: the only way to answer that is to look at the author of the various commits that have that gitlink in them. — torek, Aug 30 '21 at 10:34
Both commands does not show this commit. Today I did commit to my original project and its vmscl submodule. Then I've tried to pull to the fork again. And it fails with the same result and the same `8cc8573b39c9efde42a77701b91b3b6dcbb6b7ac` hash. — Alexander Dyagilev, Aug 30 '21 at 10:39
Who put it into a commit...? Hmm.. As I said, such commit does not exist anywhere. Git log of these four repositories does not show it. — Alexander Dyagilev, Aug 30 '21 at 10:40
`git log` *won't* show it; the submodule gitlink hash ID is *hidden*. But it's there, in some commit. It might be in the commit that your `git pull` is trying to merge to, rather than in the `HEAD` commit. Try `git ls-tree -r @{upstream}` or `git rev-parse @{upstream}:path/to/submodule`. — torek, Aug 30 '21 at 10:42
It's definitely in *some* commit. What *did* you get from those commands? Are these Git repositories available for the general public to view? Hm, something else occurs to me: you probably have recursion turned on, so it could be in a submodule of a submodule of a submodule (to some depth). — torek, Aug 30 '21 at 10:47
`git rev-parse @{upstream}:vmscl` gives `dd000bcf027f8c2f4ee2d9dd8b0be78c188a72d4`. There are no submodules inside of vmscl. Unfortunately, these Git repositories are not available for the public view... — Alexander Dyagilev, Aug 30 '21 at 11:01
Hm, and `git ls-tree -r` on the various commits shows gitlinks with `mode 160000` and appropriate hash IDs? This hash ID has to be coming from *some* commit, or the index, as those are the only places that gitlinks are ever stored. The only other possibility is indeed some kind of corruption, but Git keeps sha1 checksums on all its internal data, so you should get a completely different error. If there's a hardware problem, you could theoretically have a single bit error that only shows up here, but then the hash IDs would resemble each other. — torek, Aug 30 '21 at 11:09

score 0 · Accepted Answer · answered Aug 30 '21 at 18:51

It was a corruption of the local repository. I had the same local repository on my old laptop (I copied it to a new laptop previously via local Wi-Fi connection). Today I turned this old laptop on and was able to successfully pull all the modifications. It seems that the copy on my new laptop was damaged somehow while it was copied via Wi-Fi connection.