The change is not removed from the staging area. The entire file is removed from the staging area.
git rm --cached file1.txt
rm 'file1.txt'
++ git status
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
deleted: file1.txt
Note that the deletion shows up under "Changes to be committed". That means the file is in the HEAD commit (see the last section on git status).
Long
The way to think about this is:
- Git commits store whole files, always. They do not store changes.
- Each commit has its own independent set of files, quite apart from every other commit. (However, since the files in a commit are completely read-only, frozen for all time, any commit can share files with any other commit, if the content of those files match. The fact that you can never change any commit, not one single bit, enables this.)
- The files that go into your next commit are the files that are currently stored in the index.
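The sharing described in the second point can be observed directly. The following is a minimal sketch in a throwaway repository; the file names, commit messages, and identity settings are made up for illustration.

```shell
# Throwaway repository; all names here are illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "hello" > a.txt
echo "world" > b.txt
git add a.txt b.txt
git commit -q -m "first"

echo "changed content" > b.txt
git add b.txt
git commit -q -m "second"

# Both commits list both files, but the unchanged a.txt resolves to the
# same blob ID in each commit, so its content is stored only once.
a_old=$(git rev-parse HEAD~1:a.txt)
a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt)
b_new=$(git rev-parse HEAD:b.txt)
echo "a.txt: $a_old $a_new"
echo "b.txt: $b_old $b_new"
```

The two IDs printed for a.txt should be identical, while the two for b.txt should differ.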
The index is so important—and/or so poorly named—that Git actually has three names for it. Sometimes Git calls it the index. Sometimes Git calls it the staging area. Occasionally—rarely these days—Git calls it the cache. These different names reflect the different ways that this thing—this index/staging-area/cache—is used, but for the most part, it's all just the one thing.
Despite its importance, though, Git rarely lets you see what is in it—at least, not directly. You can easily see what is in your work tree (or working tree or any number of similar terms—again these all refer to the same thing), because your work-tree—I like to hyphenate it—holds ordinary files in their everyday format, so that every program on your computer can see them and work with them. This is not the case for files that are in commits, nor for files that are in the index.
Normally, when Git shows you a commit, it shows it by comparing the commit to some other commit. The most common comparison is between a child commit and its immediate parent. When you have a pretty-new repo with just two commits in it, one is the parent and the other is the child, and git show shows you what's in the child by:
- extracting all the files from the parent into a temporary work area;1
- extracting all the files from the child into a temporary work area; and
- comparing all the files in these two work areas.
It then merely tells you about files that are different, and by default, shows you what it sees as the difference as well.
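As a rough illustration of that comparison, the sketch below builds a two-commit repository (file names and identity are hypothetical) and asks for the parent-versus-child difference in two ways: through git show, and through an explicit git diff between the two commits.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "hello" > a.txt
echo "world" > b.txt
git add a.txt b.txt
git commit -q -m "parent"

echo "changed content" > b.txt
git add b.txt
git commit -q -m "child"

# git show on the child reports only what differs from its parent.
# The empty --format= suppresses the commit header.
shown=$(git show --name-status --format= HEAD)

# The same comparison, spelled out explicitly.
diffed=$(git diff --name-status HEAD~1 HEAD)

echo "$shown"
echo "$diffed"
```

Only b.txt is reported, marked M for modified; a.txt is identical in both commits, so nothing is said about it.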
The files that are in commits are in a special, read-only, frozen, Git-only format that Git calls a blob object. You don't really need to know this (it won't be on any quiz) to use Git. But it helps, because you do need to know about the index, to use Git. The files stored in Git's index are in this same read-only, Git-only format.2 This means that you literally can't see them, at least not without having Git extract them somewhere.
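If you are curious, git cat-file is one way to have Git extract such a file. This is only a sketch, assuming a throwaway repository with a hypothetical greeting.txt; the HEAD:path form names the committed copy and the :path form names the index copy.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "first"

# Ask Git what kind of object holds the committed file.
obj_type=$(git cat-file -t HEAD:greeting.txt)

# Have Git turn the frozen copies back into ordinary text.
from_commit=$(git cat-file -p HEAD:greeting.txt)
from_index=$(git cat-file -p :greeting.txt)

echo "$obj_type / $from_commit / $from_index"
```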
When you git checkout a commit, Git copies that commit's files into the index (but see footnote 2 for technical strictness again). Then it copies, and de-Git-ifies, the frozen-format file into your work-tree, so that you can see it and work with it.
You can now work with the work-tree files. If you change one in any way, whether that's a total replacement or a modification in place, this has no effect on the index. You probably want the changed file in your new commit, though, so now you should run git add on that file. What git add does is package up the work-tree copy of the file into the internal Git-only format, and write that into the index (and see footnote 2 again for technical accuracy).
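A small experiment makes this visible. The sketch below (hypothetical file name and contents) edits a work-tree file, reads the index copy before and after git add, and shows that only git add changes what the index holds.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "version one" > file.txt
git add file.txt
git commit -q -m "first"

# Change the work-tree copy only.
echo "version two, edited" > file.txt
before_add=$(git cat-file -p :file.txt)   # index copy is untouched

# Now copy the work-tree version into the index.
git add file.txt
after_add=$(git cat-file -p :file.txt)

echo "before git add: $before_add"
echo "after git add:  $after_add"
```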
When you make a new commit, Git packages up the index's files as a new commit. So now the new commit and the index match. The new commit becomes the current commit. If you updated the index as you went along, all three storage areas match: the current commit, the index, and your work-tree.
If you like, you can remove a file from the index. You can do this while also removing it from your work-tree, or while keeping it in your work-tree. Either way, what you've done is arrange for the next commit you make to just not have the file at all.
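Both ways of removing a file can be tried side by side. In this sketch the file names keep.txt and gone.txt are invented for the example: the first is removed from the index only, the second from the index and the work-tree.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "keep me on disk" > keep.txt
echo "delete me everywhere" > gone.txt
git add keep.txt gone.txt
git commit -q -m "first"

git rm -q --cached keep.txt   # index only; work-tree copy stays
git rm -q gone.txt            # index and work-tree

in_index=$(git ls-files)      # nothing is left in the index
git commit -q -m "drop both files"
in_commit=$(git ls-tree -r --name-only HEAD)

ls
```

After this, ls should list only keep.txt, and the new commit contains neither file.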
1This temporary work area is not your work-tree, which is mostly reserved for you to mess with. In fact, given the way commits are stored internally, Git can usually get away with not bothering to extract very much at all: it's easy for Git to tell that file F in commit P is exactly the same as file F in commit C, for instance, so for all unchanged files, Git can just do nothing at all.
2Technically, the index simply holds the file's name and a reference to the internal blob object that Git is using to store the file's content. But you can use Git without knowing this: it's OK to imagine the index holding the entire file's content, at least until you start getting deep into Git internals and using git ls-files --stage and git update-index directly.
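For readers who do want to peek at what footnote 2 describes, here is a minimal sketch using git ls-files --stage in a throwaway repository. The file name is hypothetical, and awk is used only to pick out the second field of the entry.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first"

# Each index entry is: mode, blob ID, stage number, then a tab and the path.
entry=$(git ls-files --stage file.txt)
echo "$entry"

# The blob ID recorded in the index is the same object the commit refers to.
index_blob=$(echo "$entry" | awk '{print $2}')
head_blob=$(git rev-parse HEAD:file.txt)
echo "index: $index_blob"
echo "HEAD:  $head_blob"
```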
Summary of the above
The short version of all of the above is that the index is where you build your next commit. It has a copy of every file (or more precisely, a reference to such a copy) in the form that the file would or does have in a new or an existing commit.
When you run git commit
, Git packages up the index into a new commit. The new commit becomes the current commit as soon as possible after the new commit has been created.3 So, now the index and the commit match. That's also the normal case right after git checkout
: the index and commit normally match. You make them not-match using git add
and/or git rm
. Then you make a new commit from the index, and they match again. The index starts out as a copy of the current commit. Then you change it—put entire new files in, or take entire files out—to build up your proposed new commit. Then you commit and they match.4 All of this happens mostly-invisibly, because the only files you can see and work with are the ones in your work-tree.
3This is so fast that it's almost impossible not to see it as a single operation. But it is actually separate operations: "write out commit", then "update some reference". The reference update requires adding to the reference's reflog, in most cases, and that's where you could—at least in theory, if you're fast enough—see these various steps unfold.
4There are some exceptions to this rule. See, e.g., Checkout another branch when there are uncommitted changes on the current branch. Eventually, look into git commit --only too. But it's at least relatively dependable.
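The match, not-match, match-again cycle from the summary can be checked with git diff --cached --quiet, which exits with status 0 when the index and HEAD hold the same files. The helper name index_matches_head and the file name below are made up for this sketch.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Hypothetical helper: succeeds when HEAD and the index agree.
index_matches_head() {
    git diff --cached --quiet
}

echo "version one" > file.txt
git add file.txt
git commit -q -m "first"
index_matches_head && after_first=match || after_first=differ

# git add makes the index differ from the current commit.
echo "version two, edited" > file.txt
git add file.txt
index_matches_head && after_add=match || after_add=differ

# Committing packages up the index, so the two agree again.
git commit -q -m "second"
index_matches_head && after_second=match || after_second=differ

echo "$after_first, $after_add, $after_second"
```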
Viewing the index with git status
Remember that the index (or staging area, if you prefer that name) sits, in effect, between your current commit, which Git calls HEAD, and your work-tree. That is, you can draw the current commit on the left, the index in the middle, and your work-tree on the right:
HEAD        index       work-tree
---------   ---------   ---------
README.md   README.md   README.md
file.txt    file.txt    file.txt
The HEAD copy is read-only. You can copy from it, to the index and/or the work-tree, but you can't copy to it. The index copy can be replaced wholesale (git add) or removed entirely (git rm). The work-tree copy is a regular file, so you can do anything that your computer can do, without even using Git at all.
You can't see the index copy of the file directly, but git status will do comparisons and tell you what's different. In fact, git status runs two comparisons:
- First, it compares HEAD vs the index. For every file that is the same, it says nothing at all. For a file that is different, it reports something staged for commit.
- Then, it compares the index vs your work-tree. For every file that is the same, it says nothing at all. For a file that is different, it reports something not staged for commit.
This tells you, in a very efficient way, what's in your index: i.e., what will be in the next commit. If it's different from what's in the current commit, you see a change staged for commit. If it's different from what's in your work-tree, you see a change not staged for commit.
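You can run the two comparisons yourself: git diff --cached compares HEAD with the index, and plain git diff compares the index with the work-tree. The sketch below uses two hypothetical files, one changed and staged, the other changed but not staged.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "original" > staged.txt
echo "original" > unstaged.txt
git add staged.txt unstaged.txt
git commit -q -m "first"

echo "edited and added" > staged.txt
git add staged.txt                       # index now differs from HEAD
echo "edited, not added" > unstaged.txt  # work-tree now differs from index

head_vs_index=$(git diff --cached --name-only)
index_vs_tree=$(git diff --name-only)
echo "staged for commit:     $head_vs_index"
echo "not staged for commit: $index_vs_tree"

# The short status format shows both comparisons as two columns.
git status --short
```

In the short format, the first column reflects the HEAD-vs-index comparison and the second the index-vs-work-tree comparison.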
There's one last wrinkle here. Because your work-tree is yours, to do whatever you want with it, you can put files into it that aren't in the index. Or, you can take a file that's in all three places (HEAD, the index, and your work-tree) and remove it from the index, without removing it elsewhere. You can't remove it from the commit (no commit can ever be changed) so it remains there, but it can also remain in the work-tree, and/or you can change the file in the work-tree.
Any file that is not in the index, but is in your work-tree, is what Git calls an untracked file. This is the actual definition of untracked file: it's just a file that exists in your work-tree but not in the index.
Because you can change the index (put files in, or git rm --cached to take them out), you can change the untracked-ness of any file at any time. Untracked-ness is always relative to what's in the index.
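To see untracked-ness flip back and forth, the sketch below lists untracked files with git ls-files --others --exclude-standard at three points. The file notes.txt and the variable names are hypothetical.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "some notes" > notes.txt
git add notes.txt
git commit -q -m "first"

# In the index, so not untracked.
untracked_at_start=$(git ls-files --others --exclude-standard)

# Take it out of the index only: same file on disk, now untracked.
git rm -q --cached notes.txt
untracked_after_rm=$(git ls-files --others --exclude-standard)

# Put it back in the index: tracked again.
git add notes.txt
untracked_after_add=$(git ls-files --others --exclude-standard)

echo "start: [$untracked_at_start]"
echo "after git rm --cached: [$untracked_after_rm]"
echo "after git add: [$untracked_after_add]"
```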
In any case, though, when you do have untracked files, git status normally complains about them. To shut it up (make it not complain that all your build artifacts are untracked, for instance) you can list file names, or glob patterns, in .gitignore files. These entries in .gitignore do not make files untracked. They just tell git status to shut up about them, and tell git add not to add them to the index by default. If a file that would match a .gitignore line is already tracked, though, it stays tracked.
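Here is a small sketch of that last point, with made-up file names: build.log is committed before the *.log pattern is added to .gitignore, while other.log is created afterwards and never added.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "first run" > build.log
git add build.log
git commit -q -m "track build.log"

# Ignore all *.log files from now on.
echo "*.log" > .gitignore
echo "second run, with more output" >> build.log   # already tracked
echo "scratch" > other.log                          # never tracked

# build.log is still tracked, so its modification is reported;
# other.log is untracked and ignored, so status says nothing about it.
short_status=$(git status --short)
echo "$short_status"

# Without --force, git add declines to add the ignored file.
if git add other.log 2>/dev/null; then
    add_result=added
else
    add_result=refused
fi
tracked=$(git ls-files)
echo "git add other.log: $add_result"
echo "tracked files: $tracked"
```

To stop tracking build.log as well, you would still need git rm --cached build.log; the .gitignore entry alone does not do it.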