I'm quite puzzled at how this came to be.
Git stores names (branch names, tag names, remote-tracking names, and the like) in a key-value database where the full name, e.g., refs/heads/main, is the key, and the value is a hash ID. OK, but what does this have to do with anything? Well, the actual implementation of this key-value database is pretty sleazy, at the moment:
- sometimes the name-value pair consists of a line (perhaps with an auxiliary line added) in a file named .git/packed-refs;
- sometimes the value is stored in a plain-text file (no auxiliary line added) in a directory within .git/refs, with the directory name made up of the components of the reference (with the redundant refs/ elided).
Sometimes it's even in both locations, in which case the "unpacked" file overrides the packed entry, as it's presumed to be newer.
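The lookup order described above can be sketched roughly as follows. This is a simplified illustration, not Git's actual code: the function name resolve_ref is made up for this example, and it ignores symbolic refs, linked worktrees, and any alternative ref storage backend.

```python
import os


def resolve_ref(git_dir, refname):
    """Return the value stored for refname, or None if it is not found.

    Hypothetical helper: a loose file under git_dir wins over packed-refs.
    """
    # 1. A loose file, e.g. .git/refs/heads/main, overrides any packed entry.
    loose = os.path.join(git_dir, *refname.split("/"))
    if os.path.isfile(loose):
        # Whatever is in the file is taken as the value, valid or not.
        with open(loose, encoding="utf-8", errors="replace") as f:
            return f.read().strip()

    # 2. Otherwise look for a "<hash> <refname>" line in packed-refs.
    packed = os.path.join(git_dir, "packed-refs")
    if os.path.isfile(packed):
        with open(packed, encoding="utf-8") as f:
            for line in f:
                # Skip the "#" header and the auxiliary "^<hash>" lines
                # that follow annotated tags.
                if line.startswith(("#", "^")):
                    continue
                value, _, name = line.rstrip("\n").partition(" ")
                if name == refname:
                    return value
    return None
```

Calling resolve_ref(".git", "refs/heads/main") would return the loose file's contents if that file exists, and fall back to the packed entry otherwise. Note that step 1 trusts any file it finds, which is exactly why a stray file under .git/refs looks like a ref.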
So this means that if a branch named feature/foo exists, .git/refs/heads/feature/ might exist. Or it might not!
OK, but so what? Well, if that directory does exist and you run a Python program and Python loads a file named app.py and byte-compiles it to a .pyc file (and you're using Python 3), the Python byte-compiler might write the .pyc file to a file named __pycache__/app.cpython-310.pyc.¹ Python will then use this byte-compiled file to load and run things.
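To see where a given interpreter would put the byte-compiled file, the standard library's importlib.util.cache_from_source can be asked directly. The wrapper name pyc_path_for below is made up for this example, and the exact result depends on the Python version and on settings such as PYTHONPYCACHEPREFIX.

```python
import importlib.util
import sys


def pyc_path_for(source_path):
    """Return the path this interpreter would use to cache source_path's byte-code."""
    return importlib.util.cache_from_source(source_path)


if __name__ == "__main__":
    # The cache tag names the implementation and version, e.g. "cpython-310".
    print(sys.implementation.cache_tag)
    # With default settings the result has the form
    # __pycache__/app.<cache tag>.pyc, relative to the directory of app.py.
    print(pyc_path_for("app.py"))
```

The path is always computed relative to the source file's own directory (unless a cache prefix is configured), so an app.py loaded from inside .git/refs/heads/feature/ would get its __pycache__ directory there too.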
But once Python has done that, Git will think that .git/refs/heads/feature/__pycache__/app.cpython-310.pyc is a valid ref and therefore there is a branch named feature/__pycache__/app.cpython-310.pyc. Unfortunately, its hash ID is garbage.
The same sort of thing can happen with a remote-tracking name: the only difference is that the directory in question will begin with .git/refs/remotes/origin/feature/. In both cases, Git thinks the name is valid, but the value (the hash ID) is bogus. This is what causes the failure to "lock" the "broken" reference.²
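One way to hunt for such stray files is to walk .git/refs and flag anything whose contents don't look like a ref. The sketch below assumes the loose-file backend described above; the name find_suspect_refs and the validity pattern are choices made for this example, not anything Git itself provides.

```python
import os
import re

# Assumption for this sketch: a loose ref holds a 40-digit (SHA-1) or
# 64-digit (SHA-256) hex hash ID, or a symbolic ref of the form "ref: <name>".
VALID_REF = re.compile(rb"^(?:[0-9a-f]{40}|[0-9a-f]{64}|ref: \S+)\s*$")


def find_suspect_refs(git_dir):
    """Yield paths under git_dir/refs whose contents do not look like a ref."""
    refs_root = os.path.join(git_dir, "refs")
    for dirpath, _dirnames, filenames in os.walk(refs_root):
        for name in filenames:
            # Skip Git's own transient lock files.
            if name.endswith(".lock"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read(256)  # a real ref is far shorter than this
            if not VALID_REF.match(data):
                yield path
```

Running list(find_suspect_refs(".git")) from the top of a work-tree should, under these assumptions, list files such as a stray .pyc. Whether to delete what it finds is a judgment call; inspect each file first.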
The real question, then, is: what caused a running Python program to drop a __pycache__ file into a subdirectory of your Git repository? The location of the __pycache__ directory is supposed to be the same as the location of the loaded .py file, which would suggest that something wrote an app.py file inside Git's internal .git/refs/ directory, which would be bad (programs should not do that sort of thing: they should generate their temporary files in private temporary directories or /tmp or similar).
Other than pointing to "What is __pycache__?" and "If Python is interpreted, what are .pyc files?", I must leave this mystery unsolved at this point, since I have no idea what created this app.py file in the first place.
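¹ In Python 2, and in Python 3 before 3.5, the compiled file may end with .pyo if byte-code optimization is turned on; Python 3.5 and later keep the .pyc suffix and add an opt- tag instead, as in app.cpython-310.opt-1.pyc. Python 2 writes these files next to the .py files, rather than in a __pycache__ directory. The cpython-310 part in the middle means you're using CPython version 3.10; this version tag doesn't appear in older versions of Python, which predate the __pycache__ scheme.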
² The act of "locking" a ref in this case consists of creating a file whose name ends in .lock and is otherwise the same as the full ref file name. This file holds the new value, and an atomic rename operation swaps it into place, which both updates and unlocks the ref. All of this depends heavily on POSIX file semantics, which is one of the numerous reasons one should not put a .git repository into a folder managed by cloud-sync software, which doesn't obey POSIX file semantics.
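The lock-then-rename pattern from the second footnote can be imitated in a few lines. This is only a sketch of the general technique, not Git's implementation; the function name update_ref_file is hypothetical.

```python
import os


def update_ref_file(ref_path, new_hash):
    """Write new_hash to ref_path via a .lock file and an atomic rename."""
    lock_path = ref_path + ".lock"
    # O_EXCL makes creation fail (FileExistsError) if the lock already exists,
    # i.e. if another updater currently holds it.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(new_hash + "\n")
            f.flush()
            os.fsync(f.fileno())
        # Renaming over the target updates and unlocks in one step: readers
        # see either the old value or the new one, never a partial write.
        os.replace(lock_path, ref_path)
    except BaseException:
        # On failure, remove our lock so later updates are not blocked.
        if os.path.exists(lock_path):
            os.unlink(lock_path)
        raise
```

If the lock file cannot be created, or the rename is not atomic on the underlying file system, the scheme falls apart, which is the footnote's point about folders that don't follow POSIX semantics.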