638

Can someone tell me the difference between HEAD, working tree and index, in Git?

From what I understand, they are all names for different branches. Is my assumption correct?

I found this:

A single git repository can track an arbitrary number of branches, but your working tree is associated with just one of them (the "current" or "checked out" branch), and HEAD points to that branch.

Does this mean that HEAD and working tree are always the same?

mkrieger1
Joyce Babu
  • 38
    With respect to your edit: absolutely not. `HEAD` is the commit at the tip of the current branch. If you've just checked out the branch, i.e. have no modified files, then its content matches the working tree. As soon as you modify anything, it no longer matches. – Cascabel Sep 11 '10 at 13:17
  • 11
    I think you have to read this: http://think-like-a-git.net/ – Andrzej Duś Apr 28 '14 at 12:27
  • 9
    I would also add a `Staging Area` to that list. What is `HEAD`, `Working Tree`, `Index` and a *`Staging Area`* – Green Sep 28 '16 at 14:31
  • 7
    The last sentence of @Jefromi's would be more clear as: > As soon as you modify anything, the working tree no longer matches the HEAD commit – starscream_disco_party Oct 08 '16 at 17:41
  • 4
    For any reading this in future the best way to truly understand some of these answers is to see and feel and visually conceptualize what is going on: this is the best tool for learning git ever: http://onlywei.github.io/explain-git-with-d3/#fetchrebase – BenKoshy Jul 08 '17 at 10:25
  • 4
    @Green: Staging Area and Index are the same thing. (See approved answer below) – Droopycom May 03 '18 at 04:31
  • @BKSpurgeon None of the links do anything. –  Jun 29 '18 at 13:11
  • @DrEval try hitting the refresh button and the graphics should load. – BenKoshy Jun 30 '18 at 05:42
  • 1
    HEAD is usually (when not detached) a pointer to the most recent commit on a branch rather than a commit itself . But also worth adding that a branch itself is a pointer to a commit just like HEAD is. The pointer being just an identifier that is the commit ID. – Epirocks Aug 28 '21 at 22:06

5 Answers

740

A few other good references on those topics:

workflow

I use the index as a checkpoint.

When I'm about to make a change that might go awry — when I want to explore some direction that I'm not sure if I can follow through on or even whether it's a good idea, such as a conceptually demanding refactoring or changing a representation type — I checkpoint my work into the index.

If this is the first change I've made since my last commit, then I can use the local repository as a checkpoint, but often I've got one conceptual change that I'm implementing as a set of little steps.
I want to checkpoint after each step, but save the commit until I've gotten back to working, tested code.
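A minimal sketch of that checkpoint workflow, assuming a throwaway repository and a hypothetical file named notes.txt (neither comes from the answer itself). `git add` records the checkpoint, `git diff` shows what changed since it, and `git checkout -- <file>` rolls the file back to it:

```shell
set -eu
repo=$(mktemp -d)            # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > notes.txt
git add notes.txt && git commit -q -m "working, tested state"

echo "step 1" >> notes.txt
git add notes.txt            # checkpoint step 1 in the index, no commit yet

echo "risky experiment" >> notes.txt
git diff --stat              # compares working tree to index: only the experiment shows up

git checkout -- notes.txt    # abandon the experiment: restore the file from the index
restored=$(cat notes.txt)
printf '%s\n' "$restored"
```

After the last command the file holds the checkpointed content ("v1" and "step 1"), and that content is still staged but not committed.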

Notes:

  1. the workspace is the directory tree of (source) files that you see and edit.

  2. The index is a single, large, binary file in <baseOfRepo>/.git/index, which lists all files in the current branch, their sha1 checksums, time stamps and the file name -- it is not another directory with a copy of files in it.

  3. The local repository is a hidden directory (.git) including an objects directory containing all versions of every file in the repo (local branches and copies of remote branches) as a compressed "blob" file.

Don't think of the four 'disks' represented in the image above as separate copies of the repo files.
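To see that the index really is one file and the local repository an object store, you can poke at a scratch repository. This is only a sketch: the temporary directory and the README content are made up for illustration.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
echo "hello" > README
git add README               # the first 'git add' creates .git/index

[ -f .git/index ] && index_kind="regular file"
[ -d .git/objects ] && objects_kind="directory"
echo ".git/index is a $index_kind; .git/objects is a $objects_kind"

# The staged content is stored once, as a blob object named by its hash;
# the index only records that hash next to the file name.
blob=$(git ls-files -s README | awk '{print $2}')
git cat-file -t "$blob"      # prints the object type
git cat-file -p "$blob"      # prints the stored content
```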

3 states

Refs are basically named references for Git commits. There are two major types of refs: tags and heads.

  • Tags are fixed references that mark a specific point in history, for example v2.6.29.
  • On the contrary, heads are always moved to reflect the current position of project development.
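The difference between the two kinds of refs can be watched in a scratch repository. In this sketch the tag name v0.1 and the file name are hypothetical; the branch name is read from Git rather than assumed, since it depends on local configuration:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo one > file.txt && git add file.txt && git commit -q -m "first"
git tag v0.1                                  # tag: stays on "first"
tagged=$(git rev-parse v0.1)

echo two >> file.txt && git commit -q -am "second"
branch=$(git symbolic-ref --short HEAD)       # e.g. master or main
tip=$(git rev-parse "refs/heads/$branch")     # head: moved on to "second"

# List every ref with the commit it names.
git for-each-ref --format='%(refname) %(objectname:short)'
```

The tag still names the first commit, while the branch head has advanced to the second.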

commits

(note: as commented by Timo Huovinen, those arrows are not what the commits point to, it's the workflow order, basically showing arrows as 1 -> 2 -> 3 -> 4 where 1 is the first commit and 4 is the last)

Now we know what is happening in the project.
But to know what is happening right here, right now there is a special reference called HEAD. It serves two major purposes:

  • it tells Git which commit to take files from when you checkout, and
  • it tells Git where to put new commits when you commit.

When you run git checkout ref it points HEAD to the ref you’ve designated and extracts files from it. When you run git commit it creates a new commit object, which becomes a child of current HEAD. Normally HEAD points to one of the heads, so everything works out just fine.
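As a rough illustration (scratch repository, hypothetical file name), the following shows HEAD pointing at a branch head, and what changes when you check out a commit that is not a branch:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two >> file.txt && git commit -q -am "second"

cat .git/HEAD                                 # e.g. "ref: refs/heads/master"
branch=$(git symbolic-ref --short HEAD)
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse "refs/heads/$branch")

# Checking out a commit that is not a branch name detaches HEAD.
git checkout -q HEAD~1
if git symbolic-ref -q HEAD >/dev/null; then state="attached"; else state="detached"; fi
echo "HEAD is $state at $(git rev-parse --short HEAD)"

git checkout -q "$branch"                     # re-attach HEAD to the branch
```

While HEAD is attached, HEAD and the branch resolve to the same commit, which is why a new commit moves the branch along with it.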

checkout

VonC
  • 28
    After reading about git lot many times I never ever understand it completely I got really frustrated n I wanna use the f word; But im in community! U've mentioned heads but in the images above there is always a single HEAD where r the remaining f**ng heads? "Normally HEAD points to one of the heads, so everything works out just fine." I beg u to explain this, Ur statement. – Necktwi Apr 27 '14 at 05:48
  • 16
    @neckTwi HEAD is the **current commit** you are working with (http://stackoverflow.com/a/964927/6309). It usually is one of the "branch heads" (one of the commits referenced by branches, representing the *tip* of said branches). But you can checkout (and work on) any commit. If you checkout a commit which isn't one of the (branch) heads, you are in a "detached HEAD" mode: http://stackoverflow.com/a/3965714/6309 – VonC Apr 27 '14 at 06:34
  • Aren't those arrows backwards? Shouldn't each commit point back to the commit before it? I think that would be more correct – CodyBugstein Jan 24 '15 at 17:52
  • 1
    @Imray I agree, but that is how I found those pictures 5 years ago (http://hades.name/blog/2010/01/28/git-your-friend-not-foe-vol-3-refs-and-index/) – VonC Jan 24 '15 at 18:13
  • those arrows are not what the commits point to, it's the workflow order, basically showing arrows as `1 -> 2 -> 3 -> 4` where 1 is the first commit and 4 is the last – Timo Huovinen Sep 07 '15 at 10:52
  • @TimoHuovinen Very true. I have included your comment in the answer for more visibility. – VonC Sep 07 '15 at 10:59
  • 16
    Regarding the index, I think the most useful thing that can be said is "The index is just another name for the staging area," like @ashraf-alam said. I feel like *most of the time* in discussion it's referred to as the staging area, which is why I didn't automatically make the connection that it was the same thing as the index. – Pete Feb 08 '16 at 18:47
  • 1
    @Pete I agree. For more on the difference between cache and index, see my other answer http://stackoverflow.com/a/6718135/6309 – VonC Feb 08 '16 at 19:20
  • For me, thinking of the index as the staging area is troublesome. The manpages for `git rm` state that it is used to "remove files from the working tree and from the index". In that case, the index is not what is prepared to be commited, but instead it's like an index of all the files present in the repository. What do you think? – Samir Aguiar Sep 10 '16 at 01:12
  • Nice answer, however some of the links are dead; time for a review and refresh. :) – Guy Coder Jul 10 '17 at 11:55
  • @GuyCoder Thank you. I have reviewed and refreshed the links. – VonC Jul 10 '17 at 12:08
  • I think it is also worth adding `git diff --cached` between `local repository` and `index`. – UnchartedWaters Aug 07 '17 at 09:27
  • In the image *Git Data Transport Commands*, the `checkout` and `checkout HEAD` are not working for me. I think they are wrong. Other things are fine. To come back from 'staged' to 'unstaged' state, `git reset HEAD` is used and not `checkout` – Astitva Srivastava Oct 31 '17 at 18:18
  • 1
    @AstitvaSrivastava Yet, for a file, `git checkout HEAD -- afile` and `git reset afile` would be similar, no? Both would restore the file from the local repo, removing anything that was staged: https://stackoverflow.com/a/45018564/6309, https://stackoverflow.com/a/33849726/6309. Whereas a simple checkout (no HEAD) would indeed remove local changes, restoring the index content (meaning restoring what is currently staged in the index) – VonC Nov 01 '17 at 06:15
  • so "workspace", "working directory" and "worktree" are all the same right? – Alexander Mills Aug 05 '18 at 23:26
  • @AlexanderMills yes. Working tree is used in the https://git-scm.com/docs/git-worktree man page. Workspace is a more generic term. – VonC Aug 05 '18 at 23:33
  • `git checkout` and `git checkout HEAD` do the same thing, I don't think that part is the image is correct. Following the link it notes a `.` at the end, to make `git checkout .`, as the correct command. – Nicholas Pipitone Oct 10 '18 at 23:41
  • @NicholasPipitone Yes. I believe the "revert" label in front of those two arrows suggests that git checkout -- . makes more sense indeed. Also https://stackoverflow.com/a/38165714/6309 is instructive. – VonC Oct 11 '18 at 06:23
  • 1
    @Nic Especially: "Remember that Git has this thing called the index or staging area (or sometimes cache): when you git add files, you are copying them into the index, and when you git commit, Git takes a snapshot of the index contents, which becomes the new commit. One of the implications here, which I think not enough Git documentation emphasizes, is that the index always contains every file that will be in the next commit. (And after making a commit, the index is not "empty", as several commands and bits of documentation imply: instead, it is full of everything that you just committed.)" – VonC Oct 11 '18 at 06:23
  • @VonC Why `rebase` is an arrow from the remote repository pointing to the working directory on the first picture? –  Aug 18 '19 at 18:25
  • 1
    @akobbs Because I think `rebase` is to be understood in the context of `git pull --rebase`: a `fetch` (from remote repo) + a `rebase` (on top of the remote tracking branch) – VonC Aug 18 '19 at 18:45
  • 1
    @Timo could you illustrate in a separate question what does not work? – VonC May 18 '21 at 06:38
  • [This](https://stackoverflow.com/questions/3689838/difference-between-head-working-tree-index-in-git/3690796?noredirect=1#comment52738133_3690796) link does not work that you mention above:`(note: as commented by Timo Huovinen, those arrows are not what the commits point to, it's the workflow order` – Timo May 18 '21 at 14:51
  • 1
    @Timo OK, but what "does not work" means in this context? – VonC May 18 '21 at 16:03
  • I see now, the comment link does not direct exactly to the comment, but to the comments section. Then it is up to the reader to find the right comment, as in the case of Timo's comment. – Timo May 18 '21 at 18:20
170

The difference between HEAD (the current branch, or the last committed state on the current branch), the index (a.k.a. the staging area) and the working tree (the state of the files in the checkout) is described in "The Three States" section of the "1.3 Git Basics" chapter of the Pro Git book by Scott Chacon (Creative Commons licensed).

Here is the image illustrating it from this chapter:

Local Operations - working directory vs. staging area (index) vs git repository (HEAD)

In the above image, "working directory" is the same as "working tree", "staging area" is an alternate name for the git "index", and HEAD points to the currently checked-out branch, whose tip points to the last commit in the "git directory (repository)".

Note that git commit -a would stage changes to files Git already tracks and commit them in one step.
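A small sketch of what git commit -a does and does not pick up, using a scratch repository and two hypothetical file names:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo one > tracked.txt && git add tracked.txt && git commit -q -m "first"

echo two >> tracked.txt        # modification to a tracked file
echo new > untracked.txt       # a file Git has never been told about

git commit -q -a -m "second"   # stages tracked changes and commits them
committed=$(git ls-tree -r --name-only HEAD)
leftover=$(git status --porcelain)
echo "in HEAD: $committed"
echo "still pending: $leftover"
```

The untracked file is left out of the commit; it would need an explicit git add first.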

salmanulfarzy
Jakub Narębski
  • 2
    "A picture is worth a thousand words". Thanks Jakub.. And thanks for the link. – Joyce Babu Sep 11 '10 at 10:38
  • 6
    Note: `working tree` seems to be preferred to `working directory` nowadays. See https://github.com/git/git/commit/89aef71d0eb5b5e06216c2efbba76cffe17679f7 – VonC Jul 09 '16 at 19:24
  • 5
    This picture is not exactly accurate because the Staging Area is contained in a single file called "index"--and that index file happens to be in the root of the .git directory. So if you define the repo as the .git directory, the staging area is technically inside the repo. The third column would be better labeled "HEAD's Root tree object" to indicate that the checked-out files are coming from a commit object and that committing writes a new tree to a commit object--both commit objects are pointed to by HEAD. – Jazimov Apr 28 '17 at 15:37
  • 1
    @Jazimov You are probably right, but as he wrote, he has taken that picture from the well-known Pro Git book, and he has provided a link. Thus, if the picture could be improved or is even wrong, somebody should tell the authors of that book ... In general, I would be willing to do that, but to be honest, I am still a git beginner and have not yet understood what you said, so I am definitely the wrong person in that case. – Binarus Aug 06 '17 at 10:09
  • @Binarus: The danger in the wholesale reproduction of images like this is that it serves to propagate a "misrepresentation" made by one author/book. I think this is a case of literal versus functional interpretations here: In the literal sense, the index in fact is contained IN the repo if you define the repo as everything under the .git folder. In the functional sense, however, the index helps Git maintain the DAG in the repo and can be thought of a being external to it. – Jazimov Aug 06 '17 at 17:56
  • There's a detailed explanation of the index in Jonathan Waldman's August 2017 Git Internals MSDN article: https://msdn.microsoft.com/en-us/magazine/mt493250.aspx – Jazimov Aug 06 '17 at 17:56
  • @Jazimov You are right, but there eventually was a misunderstanding. I did not want to answer your comment on the *technical* level (I couldn't do so anyway because I am still a git beginner). Instead, I just wanted to say that you (or somebody else) probably should bring this mistake to the book author's attention (in addition to correcting it here). Since the Pro Git book is the first thing anybody who wants to learn git will read, and since this image will continue to be wholesale-reproduced, having the Git Pro authors correct it will serve many, many people in the future. – Binarus Aug 07 '17 at 07:51
  • 1
    @Binarus I think it's really a semantic issue and not a "mistake", per se. The figure seems to indicate that the ".git directory" and the "repo" are synonymous and that the Staging Area is separate. I would like to see a ".git directory" label that spans Staging Area and Repo--but I would also like the Repo label to be changed to "DAG". Those changes might overwhelm a beginner, but they present a more accurate depiction of what's actually going on. Let's hope skeptical readers are led to our discussion here! :) Thanks for your comments and thoughts--you are thinking about things the right way. – Jazimov Aug 08 '17 at 17:25
86

Your working tree is what is actually in the files that you are currently working on.

HEAD is a pointer to the branch or commit that you last checked out, and which will be the parent of a new commit if you make one. For instance, if you're on the master branch, then HEAD will point to master, and when you commit, the new commit will be a child of the revision that master pointed to, and master will be updated to point to the new commit.

The index is a staging area where the new commit is prepared. Essentially, the contents of the index are what will go into the new commit (though if you do git commit -a, this will automatically add all changes to files that Git knows about to the index before committing, so it will commit the current contents of your working tree). git add will add or update files from the working tree into your index.

SherylHohman
Brian Campbell
  • Thanks a lot for the explanation Brian. So, the working tree contains all the uncommitted changes. If I commit my changes with git commit -a, then at that specific time my Working Tree and Index will be the same. When I push to my central repo, all three will be the same. Am I correct? – Joyce Babu Sep 11 '10 at 05:36
  • 3
    @Vinod Pretty much. You can have files in your working tree that Git doesn't know about, and those won't be committed with `git commit -a` (you need to add them with `git add`), so your working tree may have extra files that your index, your local repo, or your remote repo do not have. – Brian Campbell Sep 11 '10 at 06:01
  • 3
    @Vinod: The working tree and index can become the same without committing (git add updates the index from the working tree, and git checkout updates working tree from index). `HEAD` refers to the most recent commit, so when you commit, you are updating `HEAD` to your new commit, which matches the index. Pushing doesn't have much to do with it - it makes branches in the remote match branches in your local repo. – Cascabel Sep 11 '10 at 13:15
67

Working tree

Your working tree is the set of files that you are currently working on.

Git index

  • The git "index" is where you place files you want to commit to the git repository.

  • The index is also known as cache, directory cache, current directory cache, staging area, staged files.

  • Before you "commit" (checkin) files to the git repository, you need to first place the files in the git "index".

  • The index is not the working directory: you can type a command such as git status, and git will tell you what files in your working directory have been added to the git index (for example, by using the git add filename command).

  • The index is not the git repository: files in the git index are files that git would commit to the git repository if you used the git commit command.
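One way to see which changes are in the index and which are only in the working directory is to compare each pair directly. This sketch uses a scratch repository and hypothetical file names:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo a > staged.txt && echo b > unstaged.txt
git add . && git commit -q -m "first"

echo a2 >> staged.txt && git add staged.txt   # placed in the index
echo b2 >> unstaged.txt                        # left in the working directory only

in_index=$(git diff --cached --name-only)      # index vs. HEAD
in_worktree=$(git diff --name-only)            # working directory vs. index
echo "would be committed: $in_index"
echo "would not be committed: $in_worktree"
```

Running git commit at this point would record the change to staged.txt only.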

Surjit Samra
Ashraf Alam
  • 1
    Note that Git 2.5 will bring **multiple** working trees (http://stackoverflow.com/a/30185564/6309). +1 – VonC Jun 02 '15 at 17:34
  • 3
    I'm not sure that "The Index Isn't The Working Directory" is 100% correct. It should be "The Index Isn't The Working Directory, but it includes entire working directory + changes you want to be committed next". Proof? Go to a git repository, `reset --hard HEAD` to make sure that your index == your working tree, and then: `mkdir history && git checkout-index --prefix history/ -a` The result is a duplication of your entire working tree in your `history/` directory. Ergo git index >= git working directory – Adam Kurkiewicz Jul 21 '15 at 10:39
  • 3
    Index is not the working directory and does not have to include the working directory. Index is just a file within the git repository that stores info what you want to commit. – Boon Jul 27 '15 at 20:08
  • 3
    "The "index" holds a snapshot of the content of the working tree, and it is this snapshot that is taken as the contents of the next commit. Thus after making any changes to the working directory, and before running the commit command, you must use the add command to add any new or modified files to the index" (https://git-scm.com/docs/git-add) – anth Oct 15 '15 at 22:38
  • Is there one index area for all branches to store the content of next commit? I have added a file in index and then moved to other branch lets say master. I saw the same file staged even after moving to another branch and committing the change added the files in that branch. – hsingh Jul 29 '16 at 09:12
  • 3
    @AdamKurkiewicz: the proof fails if you first `echo untracked-data > untracked-file`, before or after the `git reset --hard` and `git checkout-index` steps. You will find that the *untracked* file is *not* in the `history` directory. You can also modify both index and work-tree independently, although modifying the index without first touching the work-tree is hard (requires using `git update-index --index-info`). – torek Jan 19 '17 at 22:18
  • Good clarification! – Tuhin Paul Feb 19 '20 at 21:19
  • @AdamKurkiewicz What you've said, "Ergo git index >= git working directory", doesn't conflict with "The Index Isn't The Working Directory" at all. The index has to be different from the working directory in order for it to be ">" than it. – Khoa Vo May 09 '21 at 06:54
53

This is an inevitably long yet easy-to-follow explanation from the Pro Git book:

Note: For reference you can read Chapter 7.7 of the book, Reset Demystified

Git as a system manages and manipulates three trees in its normal operation:

  • HEAD: Last commit snapshot, next parent
  • Index: Proposed next commit snapshot
  • Working Directory: Sandbox

The HEAD

HEAD is the pointer to the current branch reference, which is in turn a pointer to the last commit made on that branch. That means HEAD will be the parent of the next commit that is created. It’s generally simplest to think of HEAD as the snapshot of your last commit on that branch.

What does it contain?
To see what that snapshot looks like, run the following in the root directory of your repository:

    git ls-tree -r HEAD

It would result in something like this:

    $ git ls-tree -r HEAD
    100644 blob a906cb2a4a904a152... README
    100644 blob 8f94139338f9404f2... Rakefile
    040000 tree 99f1a6d12cb4b6f19... lib

The Index

Git populates this index with a list of all the file contents that were last checked out into your working directory and what they looked like when they were originally checked out. You then replace some of those files with new versions of them, and git commit converts that into the tree for a new commit.

What does it contain?
Use git ls-files -s to see what it looks like. You should see something like this:

    100644 a906cb2a4a904a152e80877d4088654daad0c859 0 README
    100644 8f94139338f9404f26296befa88755fc2598c289 0 Rakefile
    100644 47c6340d6459e05787f644c2447d2595f5d3a54b 0 lib/simplegit.rb
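The two listings can be compared side by side. The sketch below (scratch repository, hypothetical README content) checks that right after a commit the index records the same blob hash as HEAD, and that git add changes the index entry only:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo "readme" > README && git add README && git commit -q -m "first"

head_blob=$(git ls-tree -r HEAD | awk '{print $3}')    # fields: mode type hash path
index_blob=$(git ls-files -s | awk '{print $2}')       # fields: mode hash stage path
echo "HEAD:  $head_blob"
echo "index: $index_blob"

echo "more" >> README && git add README                # stage a new version
new_index_blob=$(git ls-files -s | awk '{print $2}')
echo "index after git add: $new_index_blob"
```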

The Working Directory

This is where your files reside and where you can try changes out before committing them to your staging area (index) and then into history.

Visualized Sample

Let's see how these three trees (as the Pro Git book refers to them) work together.
Git’s typical workflow is to record snapshots of your project in successively better states, by manipulating these three trees. Take a look at this picture:

enter image description here

To get a good visual understanding, consider this scenario. Say you go into a new directory with a single file in it. Call this v1 of the file. It is indicated in blue. Running git init will create a Git repository with a HEAD reference which points to the unborn master branch.

enter image description here

At this point, only the working directory tree has any content. Now we want to commit this file, so we use git add to take content in the working directory and copy it to the index.

enter image description here

Then we run git commit, which takes the contents of the index and saves it as a permanent snapshot, creates a commit object which points to that snapshot, and updates master to point to that commit.

enter image description here

If we run git status, we’ll see no changes, because all three trees are the same.
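The same scenario can be replayed in a terminal. This is a sketch with a hypothetical file name; the branch is not assumed to be called master, since the default depends on your Git configuration:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
echo "v1" > file.txt
git init -q
git config user.name "Example" && git config user.email "example@example.com"

# Right after init the branch is unborn: HEAD does not resolve to a commit yet.
if git rev-parse -q --verify HEAD >/dev/null; then born="yes"; else born="no"; fi

before_add=$(git ls-files)            # index is empty: nothing listed
git add file.txt                      # working directory -> index
after_add=$(git ls-files)

git commit -q -m "v1"                 # index -> new commit; branch now points to it
status=$(git status --porcelain)      # empty when all three trees agree
echo "born: $born, index before: '$before_add', after: '$after_add', status: '$status'"
```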

The beautiful point

git status shows the difference between these trees in the following manner:

  • If the Working Tree is different from index, then git status will show there are some changes not staged for commit
  • If the Working Tree is the same as index, but they are different from HEAD, then git status will show some files under changes to be committed section in its result
  • If the Working Tree is different from the index, and index is different from HEAD, then git status will show some files under changes not staged for commit section and some other files under changes to be committed section in its result.
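The three cases map onto the two status columns of git status --porcelain. In this sketch (scratch repository, hypothetical files a.txt, b.txt and c.txt), the first column reports index vs. HEAD and the second reports working tree vs. index:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo 1 > a.txt && echo 1 > b.txt && echo 1 > c.txt
git add . && git commit -q -m "first"

echo 2 >> a.txt                                        # working tree != index
echo 2 >> b.txt && git add b.txt                       # index != HEAD
echo 2 >> c.txt && git add c.txt && echo 3 >> c.txt    # both differ

status=$(git status --porcelain)
printf '%s\n' "$status"
```

Here a.txt shows M in the second column only (not staged), b.txt in the first column only (to be committed), and c.txt in both.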

For the more curious

Note about git reset command
Hopefully, knowing how the reset command works will shed more light on why these three trees exist.

reset command is your Time Machine in git which can easily take you back in time and bring some old snapshots for you to work on. In this manner, HEAD is the wormhole through which you can travel in time. Let's see how it works with an example from the book:

Consider the following repository which has a single file and 3 commits which are shown in different colours and different version numbers:

enter image description here

The state of trees is like the next picture:

enter image description here

Step 1: Moving HEAD (--soft):

The first thing reset will do is move what HEAD points to. This isn’t the same as changing HEAD itself (which is what checkout does): reset moves the branch that HEAD is pointing to. This means that if HEAD is set to the master branch, running git reset 9e5e6a4 will start by making master point to 9e5e6a4. If you call reset with the --soft option, it will stop here, without changing the index or the working directory. Our repo will look like this now:
Note: HEAD~ is the parent of HEAD.

enter image description here

Looking a second time at the image, we can see that the command essentially undid the last commit. As the working tree and the index are the same but different from HEAD, git status will now show changes in green ready to be committed.
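A sketch of step 1 on a scratch repository with three commits (the file name and the commit messages v1 to v3 are made up to mirror the pictures):

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo v1 > file.txt && git add file.txt && git commit -q -m "v1"
echo v2 > file.txt && git commit -q -am "v2"
echo v3 > file.txt && git commit -q -am "v3"

git reset -q --soft HEAD~                 # move the branch back one commit, nothing else
subject=$(git log -1 --format=%s)         # the commit HEAD now resolves to
staged=$(git diff --cached --name-only)   # the index still holds v3
worktree=$(cat file.txt)
echo "HEAD: $subject, staged: $staged, file: $worktree"
```

HEAD is back at v2 while the index and the file still contain v3, so the v3 change shows up as staged.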

Step 2: Updating the index (--mixed):

This is the default option of the command

Running reset with the --mixed option updates the index with the contents of whatever snapshot HEAD now points to, leaving the working directory intact. Your repository will then look as if you had done some work but not staged it, and git status will show those changes in red as changes not staged for commit. In other words, this option undoes the last commit and also unstages all the changes, as if you had made the changes but not yet run git add. Our repo would look like this now:

enter image description here
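Step 2 can be sketched the same way, again with a hypothetical file and three made-up commits. Omitting the option gives the default --mixed behaviour:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo v1 > file.txt && git add file.txt && git commit -q -m "v1"
echo v2 > file.txt && git commit -q -am "v2"
echo v3 > file.txt && git commit -q -am "v3"

git reset -q HEAD~                        # --mixed is the default: move branch, reset index
subject=$(git log -1 --format=%s)
staged=$(git diff --cached --name-only)   # index now matches HEAD: nothing staged
unstaged=$(git diff --name-only)          # the working directory still holds v3
worktree=$(cat file.txt)
echo "HEAD: $subject, staged: '$staged', unstaged: $unstaged, file: $worktree"
```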

Step 3: Updating the Working Directory (--hard)

If you call reset with the --hard option, it goes one step further and also makes the working directory match the snapshot HEAD now points to, so HEAD, the index and the working directory all hold the same content. After running reset --hard, it is as if you had gone back to a previous point in time and done nothing since. See the picture below:

enter image description here
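And a sketch of step 3 under the same assumptions (scratch repository, hypothetical file, three made-up commits). Unlike the other two modes, this one overwrites files in the working directory, so any uncommitted changes to them are lost:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo v1 > file.txt && git add file.txt && git commit -q -m "v1"
echo v2 > file.txt && git commit -q -am "v2"
echo v3 > file.txt && git commit -q -am "v3"

git reset -q --hard HEAD~                 # move branch, reset index, overwrite working directory
subject=$(git log -1 --format=%s)
status=$(git status --porcelain)          # empty: all three trees agree again
worktree=$(cat file.txt)
echo "HEAD: $subject, status: '$status', file: $worktree"
```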

Conclusion

I hope you now have a better understanding of these trees and a good idea of the power they give you: you can change the files in your repository to undo or redo things you did by mistake.

Mehdi Ijadnazar