
I ran the git session shown below. As you can see, when I returned to the HEAD of the master branch, git reported that the file y.y was deleted, and ls confirmed this. But when I then checked out earlier commits and returned to master, the file reappeared.

I'm a total beginner with git but I cannot for the life of me understand why git checkout master would give two different results when separated only by read-only commands.

rwilson@855:~/ht$ git log --oneline
f065234 add y.y
04a7340 2nd commit
b6ca522 init commit
rwilson@855:~/ht$ git checkout master
D   y.y
Switched to branch 'master'
rwilson@855:~/ht$ ls
x.x  x.y
rwilson@855:~/ht$ git checkout HEAD~1
Note: checking out 'HEAD~1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 04a7340... 2nd commit
rwilson@855:~/ht$ ls
x.x  x.y
rwilson@855:~/ht$ git checkout HEAD~1
Previous HEAD position was 04a7340... 2nd commit
HEAD is now at b6ca522... init commit
rwilson@855:~/ht$ ls
x.x
rwilson@855:~/ht$ git checkout master
Previous HEAD position was b6ca522... init commit
Switched to branch 'master'
rwilson@855:~/ht$ ls
x.x  x.y  y.y
Fixee

2 Answers

2

I'm guessing a bit, but initially I believe you were in "detached HEAD" mode on commit f065234, which is the commit to which the branch name master points. Thus git log --oneline showed you that commit and its two ancestors.

For whatever reason, while on that commit, you (or something you did) deleted file y.y.

First, let's get into that state.

$ cd /tmp; mkdir repo; cd repo
$ git init
Initialized empty Git repository in /tmp/repo/.git/
$ echo 'first file x.x' > x.x
$ git add x.x
$ git commit -m 'init commit'
[master (root-commit) 63ddf00] init commit
 1 file changed, 1 insertion(+)
 create mode 100644 x.x
$ echo 'second file x.y' > x.y
$ git add x.y
$ git commit -m '2nd commit'
[master 62cb693] 2nd commit
 1 file changed, 1 insertion(+)
 create mode 100644 x.y
$ echo 'third file y.y' > y.y
$ git add y.y
$ git commit -m 'add y.y'
[master b4c61d1] add y.y
 1 file changed, 1 insertion(+)
 create mode 100644 y.y
$ 

Now we just need to "detach HEAD":

$ git checkout --detach master
Note: checking out 'master'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at b4c61d1... add y.y
$ 

Next, we remove y.y, and ask git to switch to branch master. We'll see the D state for file y.y, just as you did:

$ rm y.y
$ git checkout master
D   y.y
Switched to branch 'master'
$ 

The reason I did the "detach HEAD" step above was that if I had not, I would get slightly different output. I can show you that now by just repeating the git checkout master command:

$ git checkout master
D   y.y
Already on 'master'
$ 

In both cases, though, git is already at the commit you're asking for. It therefore does not have to touch the work directory at all, so it doesn't.

But something else did! In fact, it was my rm y.y command. I removed the file. Git can see that the file is missing, so it announces this with the D line.

Next, you asked git to check out the previous commit (HEAD~1 aka master~1), which is commit 04a7340 in your repository. Mine is 62cb693, as seen in the git commit output above: I am not you, my files probably contain different data, and so on, so my repository is different and all my SHA-1s differ from yours.

To switch to that commit, git must remove the file y.y: that file is in commit f065234 and not in commit 04a7340. The file is already missing, which makes git's job really easy: it "removes" the non-existent file by doing nothing.

From this point on, if you ask git to check out commit master—i.e., commit f065234—git will have to put file y.y into your work directory. You did this in a few extra steps (checking out b6ca522 first, which also lacks y.y, and only then checking out f065234) but you got there, which means git re-extracted y.y from the underlying repository, placing it in your work directory.
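
The whole sequence can be replayed in a throwaway repository. This is only a sketch: the temporary directory, the "Example" identity, and the file contents are placeholders I made up, and I am assuming the deletion happened by hand (rm) as guessed above.

```shell
#!/bin/sh
# Sketch: reproduce the "D y.y" report and the file's later reappearance.
# The temp directory and the "Example" identity are placeholders.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use 'master' whatever init.defaultBranch says
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
for f in x.x x.y y.y; do                  # three commits, one new file each
    echo "file $f" > "$f"
    git add "$f"
    git commit -q -m "add $f"
done

git checkout -q --detach master           # same commit as master, but detached
rm y.y                                    # the deletion git later reports as "D"

git checkout master                       # same commit: work tree is left alone
after_same=$(ls | xargs)                  # y.y is still missing here

git checkout -q HEAD~1                    # target lacks y.y: nothing to remove
git checkout -q master                    # target has y.y: git writes it out again
after_roundtrip=$(ls | xargs)

echo "after first checkout: $after_same"
echo "after round trip:     $after_roundtrip"
```

The first listing lacks y.y because git had nothing to update; the second has it because moving from a commit without y.y to one with it forces git to create the file.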


Note: be careful here. git checkout has a different usage pattern that can easily erase work. This mode is spelled git checkout commit -- path in the documentation, e.g., git checkout HEAD -- x.x. Unfortunately, it is (in my opinion) too easy to invoke this other usage pattern by accident. The next section is only about git checkout commit, not git checkout commit -- path. I'll insert a brief section about the path version after this.


As a general rule, whatever commit SHA-1 you're on (whether detached, or via a branch name), what git does when you ask it to check out another commit is to compare the two commits and work out which files need to be added, removed, or modified in the work directory. (If the two commits are the same, the result is simple: no files need changing. Please think about this carefully with respect to your first comment below: if you're already on 1234567 and you git checkout 1234567, no files need changing, so git checkout simply prints a status-style summary of your local changes, much as git status would.)

Any working-directory file that does not need any changes, git leaves alone.

For any file that does need to be changed in some way, git checks to see if you might lose work. Some obvious ways to lose work include these (there are more but I'll just list these):

  • git must remove the file, but you've edited it (it doesn't match the current commit, much less the target commit, which says to remove it).
  • git must replace the file with a different version, but you've edited it.
  • git must create the file (it's in the target commit but not the current commit), but you created it yourself, with different contents than git will put in it.

In these cases, git checkout will stop with an error, unless you give it the --force flag.
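
Here is a small sketch of the first case in the list, under the same made-up setup as before (temporary directory, "Example" identity, and file contents are all placeholders). It edits y.y, which git would have to remove to reach master~1, and records whether the plain checkout is refused before retrying with --force.

```shell
#!/bin/sh
# Sketch: checkout refuses to remove an edited file unless forced.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use 'master' whatever init.defaultBranch says
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
for f in x.x x.y y.y; do
    echo "file $f" > "$f"
    git add "$f"
    git commit -q -m "add $f"
done

echo "local edit" >> y.y                  # y.y exists in master but not in master~1

# Plain checkout must delete y.y, which would lose the edit, so it should fail.
if msg=$(git checkout master~1 2>&1); then
    refused=no
else
    refused=yes
fi
kept=$(tail -n 1 y.y)                     # the edit should still be in place
echo "refused: $refused"
echo "$msg"

# With --force the edit is thrown away and the switch goes ahead.
git checkout -q --force master~1
if [ -e y.y ]; then forced_removed=no; else forced_removed=yes; fi
echo "y.y removed after --force: $forced_removed"
```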

However, in many cases, many working-directory files need no changes. For instance, in the above, I can switch from master to master~1 or master~2 with no changes to file x.x: it's the same in all three revisions.

This means git will allow me to modify x.x, or even remove it entirely, and still check out some other commit: switching from master to master~1 and/or back does not require making any changes to x.x. If I do these checkouts, git prints a one-line status message for any such file, with the letter D indicating that it's missing from the work tree, or M indicating that it's modified in the work tree (with respect to the now-checked-out commit or branch).

(Whether or not this was originally by design, people have grown to like it. It lets you start changing a bunch of files, then realize that you should have been working on another branch. So you git checkout the other branch. As long as none of your changes need to be destroyed to switch, git switches, keeping your changes; so now those changes are ready to go in the other branch.)
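
The "changes travel with you" behavior can be seen with a sketch like this one, again using a made-up throwaway repository (the temporary directory, identity, and contents are placeholders). It edits x.x, which is identical in master and master~1, and then switches commits.

```shell
#!/bin/sh
# Sketch: an uncommitted edit survives a checkout that does not touch the file.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use 'master' whatever init.defaultBranch says
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
for f in x.x x.y y.y; do
    echo "file $f" > "$f"
    git add "$f"
    git commit -q -m "add $f"
done

echo "local edit" >> x.x                  # x.x is the same in master and master~1

# The switch succeeds; git lists x.x with an M because it differs from the commit.
out=$(git checkout master~1 2>&1)
echo "$out"

last_line=$(tail -n 1 x.x)                # the uncommitted edit is still there
echo "last line of x.x: $last_line"
```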


All of the above is discussing how git checkout behaves when you ask it to check out a particular commit ("detached HEAD" mode) or branch ("non-detached"). In this mode you invoke git checkout with only one additional parameter, which should be a branch name or other commit specifier: git checkout HEAD^, git checkout master, and so on.

For whatever reasons, though, git also uses the git checkout command to extract specific files from within a commit (writing the extraction through the index/staging-area). To invoke this mode, you are supposed to give git checkout two or more parameters separated with a literal double dash --:

git checkout HEAD -- x.x x.y

This tells git checkout that the thing on the left of the -- is a commit specifier—i.e., where to look in the repository—and the thing(s) on the right are file paths, and it should throw away any work you have in progress on the paths named on the right, replacing them with the versions extracted from the commit on the left.

In other words, if you've started editing x.x and decided you want to "un-edit it", to make it look like it did in the HEAD commit, you check out the HEAD version of x.x into file x.x. Or, if you want to get the previous (HEAD^ or HEAD~1) version of x.x into the work directory and staged-for-commit, you can git checkout HEAD^ -- x.x.

This form is obviously a bit more dangerous than git checkout commit (which tries to avoid clobbering your work). Unfortunately, git checkout does not require the -- part. Furthermore, the commit specifier defaults to HEAD if you leave it out. So:

git checkout -- x.x

also means "please clobber the changes I made to file x.x", and so does:

git checkout x.x

since x.x is not the name of a commit or branch. Likewise:

git checkout .

names a path, but this time the path is "current directory", so this clobbers all work on all files that are in the current directory, or anywhere underneath it, recursively!
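
To see the path form discard work, here is a sketch in a throwaway repository (temporary directory, "Example" identity, and file contents are placeholders of mine). It edits two files, then restores one by name and the other via the current directory.

```shell
#!/bin/sh
# Sketch: the path form of git checkout silently overwrites local edits.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use 'master' whatever init.defaultBranch says
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
for f in x.x x.y y.y; do
    echo "file $f" > "$f"
    git add "$f"
    git commit -q -m "add $f"
done

echo "local edit" >> x.x
echo "local edit" >> x.y

git checkout x.x                          # x.x is not a branch or commit, so it is a path
x_after=$(cat x.x)                        # edit to x.x is gone
y_before=$(tail -n 1 x.y)                 # edit to x.y is still present

git checkout .                            # every tracked file under "." is restored
y_after=$(cat x.y)                        # edit to x.y is gone too

echo "x.x: $x_after"
echo "x.y before 'checkout .': $y_before"
echo "x.y after  'checkout .': $y_after"
```

Git 2.23 and later also ship git switch and git restore, which split these two roles into separate commands.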

(It might be nice if git used a different command for "check out specific files" vs "check out given commit-ID", making it harder to accidentally clobber work.)


You did not ask, but I'll include this here as well: the difference between "detached HEAD" and "on a branch" is actually ridiculously simple. Peek in the .git directory; you'll find a file named HEAD. When you're on a branch such as master, the contents of HEAD are ref: refs/heads/master. When you're detached, the contents are a raw SHA-1 like b4c61d1692f750607a821aa53788e3a7ce5d1199 (git normally shows you an abbreviated version). This is it in a nutshell: when you're on the branch, git gets the raw SHA-1 from the branch name, with the branch name in the HEAD file; when you're detached, git stores the raw SHA-1 directly in the HEAD file.

There's one other key difference, which applies when you create new commits. If HEAD has a raw SHA-1, git writes the new commit's SHA-1 right into HEAD. If HEAD has ref: refs/heads/branch instead, git writes the new commit's SHA-1 to a branch-mapping file, and the file HEAD remains unchanged. For efficiency, git sometimes stores these branch mappings elsewhere (for example, packed together in .git/packed-refs), although currently new commits always create or update a file named .git/refs/heads/branch.
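
You can peek at these files yourself. The following is a sketch that assumes git's default on-disk ("files") ref storage and a freshly made repository whose refs have not been packed; the temporary directory and the "Example" identity are placeholders.

```shell
#!/bin/sh
# Sketch: what .git/HEAD holds on a branch vs. detached, and where new commits land.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use 'master' whatever init.defaultBranch says
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
echo "file x.x" > x.x
git add x.x
git commit -q -m "init commit"

on_branch=$(cat .git/HEAD)                # symbolic: names the branch
branch_file=$(cat .git/refs/heads/master) # the branch-mapping file holds the SHA-1
tip=$(git rev-parse master)

git checkout -q --detach master
detached=$(cat .git/HEAD)                 # now a raw SHA-1, same commit as before

echo "more" >> x.x
git commit -q -am "commit while detached"
after_commit=$(cat .git/HEAD)             # HEAD itself moved to the new commit
master_now=$(git rev-parse master)        # master did not move

echo "on branch:    $on_branch"
echo "detached:     $detached"
echo "after commit: $after_commit"
echo "master:       $master_now"
```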

torek
  • Incredibly helpful, but my experience differs from some of what you say. For example, if I add a line to `x.x` and commit, then add another line to `x.x` and commit again, then add a third line (unstaged), I do not get warnings, but I do get bizarre behavior: `checkout` of the last commit **reapplies** my local changes (!?!?), `checkout` of the commit before this gives an error (why?). And `checkout .` silently deletes my local unstaged changes without warning and with no way to recover. I'm finding `git` to be incredibly counterintuitive. – Fixee Jun 16 '14 at 18:37
  • You've modified the file, not staged it, and you ask to check out the current commit. What's the difference between "current commit" and "current commit"? As for `git checkout .`, that's unfortunately entirely different (I sometimes wish git used different *commands* for this completely-different behavior): it's asking git to force-overwrite directory `.` from the named commit (I'll edit a bit in on that). – torek Jun 16 '14 at 19:53
1

That would be because:

  • when you checked out HEAD~1, y.y was there.
  • but when you checked out master HEAD again, y.y is now considered an untracked file in HEAD:
    • since y.y was in the working tree (because of HEAD~)
    • and since master HEAD doesn't have to touch y.y when updating said working tree

That means y.y remains there on the second checkout of HEAD.

A git clean after the second git checkout master could help.

Community
VonC