It's almost, but not quite, that symmetric.
It's true that git add file copies the file to the stage (aka "index"). However, the way it does so is a bit weird.
Inside a git repo, everything is stored as a git "object". Each object has a unique name, its SHA-1 (those 40-character strings like 753be37fca1ed9b0f9267273b82881f8765d6b23, which is from an actual .gitignore I have here). The name is constructed by computing the hash on the file's contents (more or less: git first prepends a short header giving the object's type and size, so that a file whose contents happen to look like a directory tree or commit cannot collide with that tree or commit, for instance). Git assumes that no matter the contents, the SHA-1 will be unique: no two different files, trees, commits, or annotated tags will ever hash to the same value.
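To make the "more or less" concrete, here is a minimal Python sketch of how a blob's name is derived: the hash covers a header of the form "<type> <size>" plus a NUL byte, followed by the raw contents. The function name git_object_id is my own, not anything git provides.

```python
import hashlib


def git_object_id(content: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 name git would give an object with these contents.

    git_object_id is a hypothetical helper name, used only for illustration.
    """
    # Git hashes "<type> <size>\0" followed by the raw bytes, not the bytes alone.
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()
```

Calling git_object_id(b"hello world\n") should give the same ID as running echo 'hello world' | git hash-object --stdin. Because the type is part of the header, the same bytes hashed as a different object type get a different name.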
Files (and symbolic links) are objects of type "blob". So a file that's in the git repo is hashed, and somewhere, git has a mapping: "file named .gitignore" to "hash value 753be37fca1ed9b0f9267273b82881f8765d6b23".
In the repo, directory trees are stored as objects of type "tree". A tree object contains a list of names (like .gitignore), modes, object types (another tree or a blob), and SHA-1s:
$ git cat-file -p HEAD:
100644 blob 753be37fca1ed9b0f9267273b82881f8765d6b23 .gitignore
[snip]
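If you want to poke at that listing from a script, here is a small Python sketch that splits each line into its four fields. It assumes the text form printed by git cat-file -p for a tree (mode, type, SHA-1, then a tab and the name); TreeEntry and parse_tree_listing are names I made up for illustration.

```python
from typing import List, NamedTuple


class TreeEntry(NamedTuple):
    """One line of a pretty-printed tree (hypothetical illustration type)."""
    mode: str
    type: str
    sha1: str
    name: str


def parse_tree_listing(text: str) -> List[TreeEntry]:
    """Parse lines of the form '<mode> <type> <sha1><TAB><name>'."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        meta, tab, name = line.partition("\t")
        if tab:
            mode, obj_type, sha1 = meta.split(" ")
        else:
            # Tolerate copies where the tab was turned into a space.
            mode, obj_type, sha1, name = line.split(" ", 3)
        entries.append(TreeEntry(mode, obj_type, sha1, name))
    return entries
```

An entry whose type is "tree" names a subdirectory, and you would repeat the lookup on its SHA-1 to reach the blobs below it.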
A commit object gets you (or git) a tree object, which eventually gets you the blob IDs.
The staging area ("index"), on the other hand, is simply a file, .git/index. This file contains[1] the name (in a funny slightly-compressed form that flattens out directory trees), the "stage number" in the case of merge conflicts, and the SHA-1. The actual file contents are, again, a blob in the git repo. (Git does not store directories in the index: the index only has actual files, using that flattened format.)
So, when you do:
git add file_name
git does this (more or less, and I'm deliberately glossing over filters):
- Compute the hash for the contents of file file_name (git hash-object -t blob).
- If that object is not already in the repo, write it into the repo (using the -w option to hash-object).
- Update .git/index (or $GIT_INDEX_FILE) so that it maps the name file_name to the hash that came out of git hash-object. This is always a "stage 0" entry (which is the normal, no-merge-conflict version).
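The three steps above can be sketched as a toy model in Python. This is not how git is implemented: the object store and index are plain dictionaries, filters are ignored, and toy_add, hash_blob, ObjectStore and Index are names invented for this sketch.

```python
import hashlib
from typing import Dict, Tuple

ObjectStore = Dict[str, bytes]        # SHA-1 -> blob contents (toy stand-in for the repo)
Index = Dict[Tuple[str, int], str]    # (path, stage) -> SHA-1 (toy stand-in for .git/index)


def hash_blob(content: bytes) -> str:
    """Blob name: SHA-1 over 'blob <size>\\0' plus the contents."""
    header = f"blob {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


def toy_add(objects: ObjectStore, index: Index, path: str, content: bytes) -> str:
    """Toy version of 'git add path' (hypothetical helper, filters ignored)."""
    sha1 = hash_blob(content)             # step 1: compute the hash
    objects.setdefault(sha1, content)     # step 2: store the blob if it is new
    for stage in (1, 2, 3):               # adding a file resolves any conflict stages
        index.pop((path, stage), None)
    index[(path, 0)] = sha1               # step 3: record name -> SHA-1 at stage 0
    return sha1
```

Note that the contents only ever land in objects; the index entry is just the (path, stage) key pointing at a hash.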
Thus, the file isn't really "in" the staging area, it's really "in" the repo itself! What's in the staging area is the name to SHA-1 mapping.
By contrast, git checkout [<tree-ish>] -- file_name does this:
If given a <tree-ish> (a commit name, tree-object ID, etc.; basically anything git can resolve to a tree), do the name lookup from the tree found by converting the argument to a tree object. Using the object ID thus located, update the hash in the index, as stage 0. (If file_name names a tree object, git recursively handles all the files in the directory the tree represents.) Because this creates stage 0 entries, any merge conflicts on file_name are now resolved.
Otherwise, do the name lookup in the index (if file_name is a directory, git applies this to every index entry under that directory rather than reading the working directory). Convert the file_name to an object ID (which will be a blob by this point). If there is no stage-0 entry, error out with the "unmerged" message, unless given one of the -m, --ours, or --theirs options. Using -m will "un-merge" the file (remove the stage 0 entry and re-create the conflicted merge[2]), while --ours and --theirs leave any stage 0 entry in place (a resolved conflict stays resolved).
In any case, if this has not yet errored-out, use the blob SHA-1(s) thus located to extract the repo copy (or copies, if file_name names a directory) into the working directory.
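Here is the same flow as a toy Python sketch, to show where the index is written and where it is only read. It handles a single file path, represents a tree as an already-flattened dict, and leaves out -m, --ours and --theirs entirely; toy_checkout and UnmergedError are names I invented, not git's.

```python
from typing import Dict, Optional, Tuple

ObjectStore = Dict[str, bytes]        # SHA-1 -> blob contents
Index = Dict[Tuple[str, int], str]    # (path, stage) -> SHA-1
Tree = Dict[str, str]                 # flattened path -> SHA-1 (real trees nest)
WorkTree = Dict[str, bytes]           # path -> file contents


class UnmergedError(Exception):
    """Raised when a path has conflict stages but no stage 0 entry."""


def toy_checkout(objects: ObjectStore, index: Index, worktree: WorkTree,
                 path: str, tree: Optional[Tree] = None) -> None:
    """Toy 'git checkout [<tree-ish>] -- path' for one file (hypothetical helper)."""
    if tree is not None:
        # With a tree-ish: look the name up in the tree and overwrite the index.
        sha1 = tree[path]
        for stage in (1, 2, 3):
            index.pop((path, stage), None)
        index[(path, 0)] = sha1
    else:
        # Without one: only read the index, and insist on a stage 0 entry.
        if (path, 0) not in index:
            if any((path, stage) in index for stage in (1, 2, 3)):
                raise UnmergedError(f"path '{path}' is unmerged")
            raise KeyError(path)
        sha1 = index[(path, 0)]
    # Either way, the contents come out of the object store, not the index.
    worktree[path] = objects[sha1]
```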
So, the short version is "yes and no": git checkout sometimes modifies the index, and sometimes only uses it. However, the file itself is never stored in the index, only in the repo. If you git add a file, change it some more, and git add it again, the blob from the first git add is left behind as what git fsck will report as a "dangling blob": an object with no reference.
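A tiny Python sketch of why the first blob ends up dangling. It assumes a toy world in which the index is the only thing that can reference a blob (real git fsck also walks commits, trees, refs and so on); dangling_blobs is a made-up helper name.

```python
import hashlib
from typing import Dict, Set, Tuple


def hash_blob(content: bytes) -> str:
    """Blob name: SHA-1 over 'blob <size>\\0' plus the contents."""
    header = f"blob {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


def dangling_blobs(objects: Dict[str, bytes],
                   index: Dict[Tuple[str, int], str]) -> Set[str]:
    """Blobs in the store that no index entry points at (toy reachability)."""
    return set(objects) - set(index.values())


objects: Dict[str, bytes] = {}
index: Dict[Tuple[str, int], str] = {}

# "git add", edit, "git add" again: both blobs are stored, but the second
# add overwrites the single stage 0 entry for file_name.
for content in (b"first draft\n", b"second draft\n"):
    sha1 = hash_blob(content)
    objects[sha1] = content
    index[("file_name", 0)] = sha1

leftover = dangling_blobs(objects, index)
```

After this runs, leftover holds only the hash of the first draft: the blob is still in the repo, but nothing names it any more.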
[1] I'm deliberately omitting a lot of other stuff in the index that is there to make git perform well, and to allow --assume-unchanged etc. (These are not relevant to the add/checkout action here.)
[2] This re-creation respects any change to merge.conflictstyle, so if you decide you like diff3 output and already have a conflicted merge without the diff3 style, you can change the git config and use git checkout -m to get a new working-directory merge with the new style.