4

I'll preface this by mentioning that I've been working with Git for years, but my knowledge is limited to very basic workflows. Realizing this, I've been getting a lot better with "advanced" Git features, but here's a question I can't quite figure out:

What exactly does git checkout [file] do?

My understanding was that it retrieves the latest committed revision of that specific file, discarding any unstaged changes in the working copy.

So, I am wondering if this differs in any way from git restore [file], and when I might want to use either of the two.

Sadly, I haven't been able to find a super clear answer elsewhere.

Andys1814
  • 1
    This answer may be helpful: https://stackoverflow.com/a/58003889/83605 – Aziz Jul 02 '20 at 17:04
  • 1
    @Aziz Great note... that helps me understand the context behind restore a little bit better. So they basically achieve the exact same thing, but restore was included as a variant (among others) to reduce common confusion? – Andys1814 Jul 02 '20 at 17:11
  • 1
    @Andys1814 Basically. I could imagine that once the interface provided by `git restore` stabilizes, `git checkout` can be changed to no longer duplicate what has been delegated to `git restore`. – chepner Jul 02 '20 at 18:39
  • For part of the history, see https://stackoverflow.com/q/57123031/5784831, especially VonC's answer. – Christoph Jul 02 '20 at 18:41

3 Answers

6

git checkout does two things: it switches branches and it restores files to a certain state.

In Git 2.23, these two behaviours were split out into two new commands: git switch and git restore.

git checkout still does both things, and it will probably keep doing them for many years, but the new commands are recommended because they are clearer.
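To make the split concrete, here is a minimal sketch that assumes Git 2.23 or newer and a throwaway repository. The file name notes.txt, the branch name feature, and the demo identity are made up for illustration; the comments show the older git checkout spelling of each step.

```shell
#!/bin/sh
# Sketch only: the two jobs of `git checkout`, done with the newer commands.
set -e
repo=$(mktemp -d)            # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # made-up identity for the demo
git config user.name "Demo"

echo "original" > notes.txt
git add notes.txt
git commit -q -m "first commit"

# Job 1: switching branches. Older spelling: git checkout -b feature
git switch -q -c feature
current_branch=$(git branch --show-current)

# Job 2: restoring a file. Older spelling: git checkout -- notes.txt
echo "scratch edit" > notes.txt
git restore notes.txt
restored=$(cat notes.txt)

echo "branch: $current_branch, notes.txt: $restored"
```

The unstaged edit is discarded in the second step because nothing was staged, so the index copy still matches the commit.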

axiac
6

Between the answers from axiac and Jay you have a pretty complete picture, but I'd like to cover the whole thing a bit differently. There are two git checkout forms of interest here:

  1. git checkout -- path: the path argument is treated as a file or directory—if it's a directory, it means all files within that directory, recursively—and Git copies the current copy of the file as found in the index out to your work-tree.

    We'll say more about the index in a moment.

  2. git checkout HEAD -- path: the path argument is treated the same way, but Git copies the current copy of the file as found in the HEAD commit to both the index and your work-tree.

    You can use a name other than HEAD here: any commit specifier works, and Git copies that commit's copy of the file to both the index and your work-tree.
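The two forms above can be sketched as follows. This is an illustration under assumptions, not a definitive recipe: the docs directory, its files, and the two commits are invented for the demo, and HEAD~1 stands in for "any commit specifier".

```shell
#!/bin/sh
# Sketch only: both checkout forms applied to a directory path.
set -e
repo=$(mktemp -d)            # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # made-up identity for the demo
git config user.name "Demo"

mkdir -p docs/sub
echo "first" > docs/a.txt
echo "first" > docs/sub/b.txt
git add docs
git commit -q -m "first version"

echo "second" > docs/a.txt
echo "second" > docs/sub/b.txt
git commit -q -am "second version"

# Form 1: copy from the index to the work-tree, recursively for a directory.
echo "scratch" > docs/a.txt
echo "scratch" > docs/sub/b.txt
git checkout -- docs
after_form1="$(cat docs/a.txt) $(cat docs/sub/b.txt)"

# Form 2 with a commit other than HEAD: copy from that commit
# to both the index and the work-tree.
git checkout HEAD~1 -- docs
after_form2="$(cat docs/a.txt) $(cat docs/sub/b.txt)"

echo "after form 1: $after_form1"
echo "after form 2: $after_form2"
git status --short           # both files now show as staged changes
```

After the second form, git status reports the files as staged, because the older content was written into the index as well as the work-tree.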

For git restore, you have more flexibility. You can tell Git to copy a file:

  • from the index, or from any commit (use the --source option to choose a source explicitly, or let the command figure one out on its own)
  • to the index, or to your work-tree, or both (use the --staged option to copy to the index, and/or the --worktree option to copy to the work-tree).

Note that if you use git checkout and choose a commit, the file goes to both places: the index, and your work-tree. Copying "from the index, to the index" does not really do anything (and might net you a complaint since it seems like a weird thing to ask).
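As a rough illustration of those source and destination choices, the sketch below walks one file through three git restore invocations. It assumes Git 2.23 or newer; the file name and contents are invented, and the command is still documented as experimental, so treat the behaviour shown as what current versions do rather than a guarantee.

```shell
#!/bin/sh
# Sketch only: choosing source and destination with git restore.
set -e
repo=$(mktemp -d)            # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # made-up identity for the demo
git config user.name "Demo"

echo "committed" > file.txt
git add file.txt
git commit -q -m "initial"

echo "staged" > file.txt
git add file.txt             # index now holds "staged"
echo "unstaged" > file.txt   # work-tree now holds "unstaged"

# No options: copy from the index to the work-tree.
git restore file.txt
step1_worktree=$(cat file.txt)

# --staged: copy from HEAD to the index only; the work-tree is untouched.
git restore --staged file.txt
step2_worktree=$(cat file.txt)
step2_index=$(git show :file.txt)

# Explicit source, both destinations: HEAD to index and work-tree.
git restore --source=HEAD --staged --worktree file.txt
step3_worktree=$(cat file.txt)

echo "1: work-tree=$step1_worktree"
echo "2: work-tree=$step2_worktree index=$step2_index"
echo "3: work-tree=$step3_worktree"
```

The last invocation is the git restore counterpart of git checkout HEAD -- file.txt.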

About the index and your work-tree

Your work-tree is pretty simple and obvious. Git stores each commit as a full snapshot: a complete copy of every file you've committed. But files stored inside Git's commits are stored in a special, read-only, compressed, de-duplicated, Git-only format. This keeps the .git directory (the repository proper) from getting too big too fast. Most commits really just re-use most previous files, which means the de-duplication works extremely well. For those times when it doesn't work so well, the compression usually takes care of the rest.[1] On the other hand, because the files here are read-only and aren't ordinary files, you can't actually use them to get your work done. Git has to copy these files out, turning them back into ordinary files that you, and your computer, can use.

The useful, everyday-work copies of your files appear in what Git calls your work-tree or working tree. You can do anything you want with these files. Git doesn't actually use them! It does create them, having extracted the compressed Git-only stored files from a commit, but it doesn't make commits from them.

Git's index, which Git also calls the staging area or sometimes (rarely these days) the cache, holds a copy[2] of every file that came out of the current commit and is now ready to go into the next commit you will make.[3] When you run git commit, Git just packages up whatever is in the index at that time. This is why you keep having to git add a file. If you change a file, you haven't changed Git's copy of the file: you have only changed the work-tree copy. You use git add to tell Git: Copy my updated work-tree file back into your index, replacing the old copy there.

Hence, the index always holds your proposed next commit. You use git add, or if you need to remove files, git rm, to update (or remove) these index copies of your files. They start out with the copies from the commit you've checked out: the current or HEAD commit. So if you haven't used git add, the index and HEAD commit copies normally match.[4] When they do match, the presence of the index copy tends to be invisible: git status doesn't mention the index copy, for instance.

If you're curious, though, try running git ls-files --stage (be prepared for a lot of output in a big repository!—this command doesn't run its output through a pager). This will show you every file in the index.
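Here is a small sketch of that, showing that editing a file leaves the index entry alone and that git add is what replaces it. The repository, file name, and contents are made up for the demo; the awk call simply picks the blob hash out of the second column of the git ls-files --stage output.

```shell
#!/bin/sh
# Sketch only: watching one index entry before and after git add.
set -e
repo=$(mktemp -d)            # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # made-up identity for the demo
git config user.name "Demo"

echo "one" > file.txt
git add file.txt
git commit -q -m "initial"

# Right after a commit, the index entry and the HEAD copy are the same blob.
head_blob=$(git rev-parse HEAD:file.txt)
index_blob=$(git ls-files --stage file.txt | awk '{print $2}')

# Editing the work-tree file does not touch the index entry.
echo "two, edited" > file.txt
index_after_edit=$(git ls-files --stage file.txt | awk '{print $2}')

# git add copies the work-tree file into the index, replacing the old entry.
git add file.txt
index_after_add=$(git ls-files --stage file.txt | awk '{print $2}')

echo "HEAD:        $head_blob"
echo "index:       $index_blob"
echo "after edit:  $index_after_edit"
echo "after add:   $index_after_add"
```

Only the last hash differs from the others, which is the "proposed next commit" changing.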


[1] The compressed, de-duplicated format doesn't work for some very large binary file formats. In this case Git can become bloated. This is where things like Git-LFS come in.

[2] Technically, the index doesn't hold an actual copy of the file, but rather a reference. The difference only shows up if you start using the low-level commands git update-index or git ls-files --stage, which can inspect the index contents directly. For everyday use, though, you can just think of the index as holding a copy of the files. The index copy is in the frozen and de-duplicated format (i.e., pre-frozen and pre-de-duplicated, albeit overwrite-able), which makes committing go fast.

[3] The index also takes on an expanded role when you have to deal with merge conflicts. This answer does not cover this case.

[4] The word normally here is to take care of a lot of corner cases, including things like git reset --soft and special varieties of git commit that don't necessarily use the regular index.

torek
3

git checkout file will revert the specified file(s) or folder(s) to their state in the index, i.e. their staged state (which is the same as the current HEAD if no changes are staged for commit).

So let's say your file content on current commit (HEAD) is a.

And that you used git add in the past to stage a change from content a to content b.

Then your current git diff --staged shows

diff --git a/file b/file
--- a/file
+++ b/file
@@ -1 +1 @@
-a
+b

Additionally, further changes were made to the file but not yet added.

Running git checkout -- file will revert the file to content b and throw away its current content. This means the staged changes (visible in git diff --staged -- file) will stay, while the unstaged changes (visible in git diff -- file) will be lost.

If you want to actually change the file back to HEAD (content a), then you'll have to say so explicitly:

git checkout HEAD -- file

Then both the unstaged and the staged changes will be undone.

(You could also do this by first unstaging the changes with git reset -- file and then running git checkout -- file. git reset --hard does both at the same time, but note that it applies to every tracked file, not just this one.)
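The scenario above can be replayed in a scratch repository. This is a minimal sketch: the contents a and b come from the description, while c for the unstaged edit, the repository location, and the demo identity are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch only: checkout -- file versus checkout HEAD -- file.
set -e
repo=$(mktemp -d)            # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # made-up identity for the demo
git config user.name "Demo"

echo "a" > file
git add file
git commit -q -m "content a"

echo "b" > file
git add file                 # staged change: a -> b
echo "c" > file              # further, unstaged change (hypothetical content)

# Restores from the index: the unstaged edit is lost, the staged one stays.
git checkout -- file
after_plain=$(cat file)
staged_after_plain=$(git diff --staged --name-only)

# Restores from HEAD: staged and unstaged changes are both undone.
echo "c" > file
git checkout HEAD -- file
after_head=$(cat file)
staged_after_head=$(git diff --staged --name-only)

echo "after 'checkout -- file':      $after_plain (staged: $staged_after_plain)"
echo "after 'checkout HEAD -- file': $after_head (staged: ${staged_after_head:-none})"
```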


As for the differences from git restore: since that command is tagged THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE., it's hard to make any meaningful prediction here.

However, based on the current documentation, it apparently also restores from the index by default rather than from HEAD (so it will keep the staged changes, just like git checkout -- file does).

I did notice one interesting difference, though: git checkout -- file will refuse to do anything if the file was deleted (or never existed) in the source. So if you git rm a file and then write contents to it, you can't delete it again by using git checkout -- file. git restore -- file apparently can be used to "restore" the file to its state of non-existence, i.e. delete it.

Jay