What exactly does an index refer to when talking about previous commits?

Question

I really don't understand this concept of the index matching HEAD. The index is what's about to be committed, so shouldn't the index always be empty because everything is committed when doing a reset to a previous commit?

Going off of this question, it seems like the git index contains a cached version of everything rather than just the changes, so is this at all relevant?

torek · Accepted Answer · 2015-11-14T22:24:27.840

I think I know what you're confused by. Git does not store diffs.¹

I think you already have seen this, but let me restate it: The index contains the next commit you will make.

When you've just made a commit, you made it using the index,² hence there's no difference between "what's in the index now" and "what's in HEAD". That does not mean the index is empty. In fact, the index is still full of "what's in HEAD". If you run git commit --allow-empty, git commits the same index again, getting a new commit that uses the same tree as the previous commit. If you instead empty out the index,³ then compare it to HEAD, you'll see that you're about to delete every file.
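Here is a minimal sketch of the paragraph above, run in a throwaway repository. The temporary directory, the file name file.txt, and the demo identity are assumptions for illustration only.

```shell
set -e
repo=$(mktemp -d)            # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo hello > file.txt
git add file.txt
git commit -q -m "first"

# Right after a commit the index is not empty: it still lists file.txt.
index_files=$(git ls-files)

# The index and HEAD match, so the staged diff is empty (exit status 0).
index_matches_head=no
git diff --cached --quiet && index_matches_head=yes

# Committing the same index again gives a new commit with the same tree.
tree_before=$(git rev-parse 'HEAD^{tree}')
git commit -q --allow-empty -m "same tree again"
tree_after=$(git rev-parse 'HEAD^{tree}')

# Remove every entry from the index (the working tree is left alone).
# Compared to HEAD, the next commit would now delete file.txt.
git rm -r -q --cached .
staged_changes=$(git diff --cached --name-status)
echo "$staged_changes"
```

The last command should report file.txt as a staged deletion, even though the file is still sitting in the working tree.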

Something else that might help is to realize that when you ask git to compare the index to an existing commit (git diff --cached, git status, and so forth) it conjures up a new diff right then and there. It does the same thing for git diff <commit-1> <commit-2>, and also for git show <commit-id> and git log -p. Git doesn't store diffs, it generates new ones every time.⁴
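You can check this yourself with a small sketch like the one below. Again the repository, the file name, and the identity are made up for the demonstration. It looks at what a commit object and an index entry actually hold, then asks for a diff.

```shell
set -e
repo=$(mktemp -d)            # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > file.txt
git add file.txt
git commit -q -m "first"

echo two >> file.txt
git add file.txt

# A commit object names a tree (a full snapshot); it contains no diff text.
commit_body=$(git cat-file -p HEAD)
echo "$commit_body"

# The index entry is also the whole staged file, not just the added line.
staged_content=$(git show :file.txt)

# The diff is computed at the moment you ask for it, from index vs. HEAD.
staged_diff=$(git diff --cached)
echo "$staged_diff"
```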

¹This deliberately ignores "delta compression" in git's "pack files", which do sort of store diffs, but not in the way other version control systems do it.

²Actually, you can use any number of different index files, and some commands—such as git commit -- <path>, for instance—make a new temporary index rather than using "the" (default) index. But normally the last commit you made used "the" index, rather than some alternative temporary index file.
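As a rough illustration of an alternative index, the GIT_INDEX_FILE environment variable points git at a different index file. The path .git/alt-index and the file names below are hypothetical, and this is only a sketch of the idea, not what git commit does internally.

```shell
set -e
repo=$(mktemp -d)            # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > a.txt
git add a.txt
git commit -q -m "first"

echo extra > b.txt
alt_index="$repo/.git/alt-index"   # hypothetical alternate index file

# Fill the alternate index from HEAD, then stage b.txt only there.
GIT_INDEX_FILE="$alt_index" git read-tree HEAD
GIT_INDEX_FILE="$alt_index" git add b.txt

# The alternate index now lists both files; "the" index still lists one.
alt_files=$(GIT_INDEX_FILE="$alt_index" git ls-files)
default_files=$(git ls-files)
echo "alternate: $alt_files"
echo "default:   $default_files"
```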

³To "empty out" the index, you actually have to write "remove file" records to it, because git will re-generate the index if you actually make it an empty file, or remove it. This is because the index actually has a dual role: it's both "the next commit" and "a way to speed up git so that it doesn't have to read a lot of directories and files".

Also, the --allow-empty flag is misleading here: it means "allow the diff to be empty", not "allow the index to be empty".

⁴Linus Torvalds describes this as a feature, because it means that if someone makes git diff smarter, that new smart-ness now applies retroactively to every previous commit. This is a correct observation, and all it costs is a little⁵ compute power.

⁵And by "a little" we mean "a metric ****load" :)

Wow, great answer! Thanks! It's really confusing because when learning from tutorials, it seems like the index gets cleared every time you commit (hence that the staging area = the entire index) — rb612, Nov 14 '15 at 18:49
