Do I have this right? There is a lot of confusion on the internet, on stackoverflow.com, and even in the git-scm.com manual.

In the first place:

  1. HYPOTHESIS: You don't really check out branches; you check out COMMITS! When Git says it checks out a branch, it means "check out the head (tip commit) of that branch." But you may just as well check out a commit that is not the head (lowercase intended) of any branch. You can still refer to it as HEAD (uppercase intended) in commands, and we call this state a "detached head/HEAD" to point out that the checked-out commit (HEAD) may belong to branch X but is not the head (lowercase intended) of branch X.

What HEAD(written in uppercase) is:

  1. HEAD is a reference to the CHECKED OUT commit. Usually that is the commit at the tip of some branch (referred to as a head, lowercase, one of many heads, or in some contexts as the branch itself; do I have that right?). But whenever you check out a commit that is not a head, you may still refer to it as HEAD; this is the so-called detached HEAD. [SOURCE]

What HEAD(written in uppercase) is NOT:

  1. HEAD is NOT ALWAYS a reference to the last commit in the currently checked-out branch.

What head(written in lowercase) is:

  1. A head is a named reference to the commit at the tip of a branch [Source]. By "head" we may mean any one of the tips of the repository's many branches. Do I have that right?

What head(written in lowercase) is NOT:

  1. head is NOT a reference to the CHECKED OUT commit

What detached HEAD(written in uppercase) is:

  1. detached HEAD is a HEAD(uppercased) that is not the tip of any branch

Moreover:

  1. There is such a thing as a detached HEAD, but we should never use the phrase "detached head" (lowercased), as it would make no sense given the definition of "head".

My confusion arose when I started comparing GitHub explanations that contradict each other, the git-scm.com manual, and the behaviour of "git checkout HEAD~1": it checks out the parent of the currently checked-out commit, not the parent of the commit that is the head, i.e. the tip of the branch. It made me angry because all of those answers were so heavily upvoted that I thought I could trust them; it turned out I could not.

torek
Puti
  • `it checks out parent commit to currently checked out commit, not parent commit to commit being head i.e. tip of the branch` I can't see how it contradicts the definitions you have found – The Dreams Wind Jun 16 '22 at 09:02
  • "HEAD is NOT ALWAYS a reference to the last commit in the currently checked-out branch". Yes it is always. If a branch is checked out, of course. If you checkout an older commit in your branch, your branch isn't "the currently checked-out branch" any more. Also, I don't get why you define "head" like it's something different than "branch". These are synonyms. – Romain Valeri Jun 16 '22 at 09:54
  • I didn't summarize what I found contradicting in the first place, I don't think it is gonna be helpful in any way for me to learn, it may let me rant at these answers at best. But if you wish: [link](https://stackoverflow.com/questions/2529971/what-is-the-head-in-git/2529982#2529982) and [link](https://stackoverflow.com/questions/3689838/whats-the-difference-between-head-working-tree-and-index-in-git#comment3889372_3689838) and [link](https://stackoverflow.com/questions/3689838/whats-the-difference-between-head-working-tree-and-index-in-git/3690522#3690522) – Puti Jun 16 '22 at 09:57
  • @Romain how does it refer to my point 1. hypothesis? – Puti Jun 16 '22 at 09:59
  • Checking out a branch is something more than checking out a commit. Yes in both cases a commit is checked out eventually. But when a branch is checked out (in opposition to checking out a commit directly), git also knows it has to do certain tasks when you commit, like moving the branch's tip, updating its reflog, and so on. – Romain Valeri Jun 16 '22 at 10:01
  • May I suggest the following [answer](https://stackoverflow.com/questions/2304087/what-is-head-in-git/67862196#67862196) from the SO thread _What is HEAD in git?_ which clearly describes what **HEAD** is and how _detached_ and _attached_ state relates to each other. – Alexis Määttä Vinkler Jun 16 '22 at 11:11

1 Answer

HEAD, written in all uppercase, is special to Git.

head, written in all lowercase or mixed case, is not special to Git. (Humans also use it, as in the phrase "the head of branch X", to mean what Git calls the tip commit of that branch. Humans also use it to mean HEAD in all uppercase, when they're too lazy to press the SHIFT key. Humans are often sloppy and mistaken, too, so if you find a human saying that the sky is green or that up is down, it could just be human error. You can't trust humans.)

HYPOTHESIS: You don't really checkout branches - you check out COMMITS ...

This is partially, or even mostly, true. But: when you use git checkout or git switch to select a branch by name, that branch becomes the current branch. There is a mechanism for this, and this mechanism runs into some problems sometimes; I'll come to this in a moment.

When you use git checkout --detach or git switch --detach to select a commit by hash ID or non-branch name, that commit becomes the current commit. There is a mechanism for this as well.
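
To see which of the two states you are in, you can ask Git whether HEAD is currently a symbolic ref. The following is only a sketch: the function name head_mode is made up for this example, and it assumes you run it inside a repository that has at least one commit.

```shell
# head_mode is a hypothetical helper name, not a Git command.
# It reports whether HEAD is attached to a branch or detached.
head_mode() {
    # "git symbolic-ref --quiet" exits non-zero when HEAD is not a symbolic ref,
    # which is exactly the detached-HEAD case.
    if branch=$(git symbolic-ref --quiet --short HEAD); then
        printf 'on branch %s\n' "$branch"
    else
        printf 'detached at %s\n' "$(git rev-parse --short HEAD)"
    fi
}
```

Running head_mode after git checkout of a branch name should report the branch, and running it after git checkout --detach should report the abbreviated hash ID of the current commit.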

The underlying mechanisms and the problems involved

Git stores branch names (refs/heads/main and the like), tag names (refs/tags/v1.2 and the like), remote-tracking names (refs/remotes/origin/main and the like), and so on, in a name-to-hash-ID lookup database. If this database were a true database (some sort of SQL or MongoDB or Berkeley DB or whatever instance), we might not have any issue here, but it's not: the "database" was originally a really simple and somewhat cheesy system in which refs/heads/branch was just the file refs/heads/branch in the file system in the .git directory where the repository was stored.

This is very simple and works fine for a small number of branches and tags, but when you start getting 40,000 tags in a repository (various Google projects) it becomes inefficient. So Git grew a second way to store branch and tag names: a flat file named .git/packed-refs may contain lines, with each line giving a branch or tag name and a hash ID (and, for annotated tags, the "fully peeled" hash ID, though that is stored on a second line).
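
As a rough illustration of the flat-file format, here is a sketch of a lookup in a packed-refs file. The function name packed_ref_lookup is invented for this example. It assumes the layout Git writes today: a header line starting with "#", one "hash-ID name" pair per line, and peeled hash IDs on their own lines starting with "^".

```shell
# packed_ref_lookup is a hypothetical helper, not part of Git.
# Usage: packed_ref_lookup .git/packed-refs refs/tags/v1.2
packed_ref_lookup() {
    # Skip the "#" header and the "^" peeled lines, then print the hash ID
    # of the first line whose ref name matches. Exit non-zero if none does.
    awk -v name="$2" '
        /^[#^]/    { next }
        $2 == name { print $1; found = 1; exit }
        END        { exit !found }
    ' "$1"
}
```

In practice you would use git rev-parse or git for-each-ref rather than reading this file yourself; the sketch is only meant to show how little structure the file has.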

It doesn't work for HEAD though, because HEAD is normally a symbolic ref. An "attached" HEAD—the case where HEAD contains a branch name—was originally implemented as a symbolic link. So .git/HEAD would be a symlink: if .git/HEAD was a symlink to refs/heads/branch, branch was the current branch.

This does not work on Windows (many versions of Windows anyway), which lacks symbolic links. So Git grew a new mechanism to handle this: .git/HEAD could be, and now always is, an ordinary file containing the literal text ref: refs/heads/branch (plus a newline) to indicate that you're on branch branch.

To enter detached HEAD mode, Git would replace the symbolic link .git/HEAD with a file .git/HEAD that would contain the raw hash ID of the current commit. This is still used today: if you're in detached HEAD mode on commit 9c897eef06347cc5a3eb07c3ae409970ab1052c8, .git/HEAD contains that string (plus a newline).
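
Since the file holds either a "ref: " line or a raw hash ID, telling the two states apart takes only a few lines of shell. This is a sketch under that assumption; describe_head_file is a made-up name, and the function reads the file directly instead of asking Git.

```shell
# describe_head_file is a hypothetical helper, not part of Git.
# Usage: describe_head_file .git/HEAD
describe_head_file() (
    line=
    # HEAD holds a single line in both cases; tolerate a missing final newline.
    IFS= read -r line < "$1" || [ -n "$line" ] || exit 1
    case $line in
        "ref: "*) printf 'attached: %s\n' "${line#ref: }" ;;  # symbolic ref
        *)        printf 'detached: %s\n' "$line" ;;          # raw hash ID
    esac
)
```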

So that's the current situation: a file named .git/HEAD (spelled in all uppercase) contains either ref: refs/heads/<current-branch> or a hash ID. But—hang on a moment—a few years ago Git grew the git worktree facility. There's no longer a working tree and a (single) HEAD; there's now one HEAD per work-tree. So .git/worktrees/ will contain auxiliary HEAD files. These also will contain branch names (prefixed with ref: and fully spelled out) or hash IDs.

Each added work-tree has its own HEAD, in other words. But it turns out there are additional work-tree-specific refs: the bisection refs, for instance, need to be per-work-tree so that git bisect good and git bisect bad can be run in any one work-tree without affecting any other.

So our situation is now complicated. There may be files underneath .git/refs/ that contain branch, tag, and other names and hash IDs; these are either shared across all working trees, or are for the main work-tree, with the work-tree-specific hash IDs living somewhere under .git/worktrees/. There may or may not also be one or more packed-refs files for the main working tree and added work-trees (though I don't believe there are packed refs for work-tree-specific refs). And there's exactly one HEAD, in all uppercase, for each work-tree, with the main work-tree's HEAD having the special magic reserved .git/HEAD name.
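
To make the "one HEAD per work-tree" point concrete, the sketch below prints every HEAD file it finds under a repository directory. The name list_heads is invented for this example, and it assumes the on-disk layout described above (.git/HEAD plus .git/worktrees/<something>/HEAD).

```shell
# list_heads is a hypothetical helper, not part of Git.
# Usage: list_heads .git
list_heads() (
    gitdir=${1:-.git}
    for f in "$gitdir/HEAD" "$gitdir"/worktrees/*/HEAD; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        line=
        IFS= read -r line < "$f"
        # Print the path relative to the repository directory, then the contents.
        printf '%s: %s\n' "${f#"$gitdir"/}" "$line"
    done
)
```

The supported way to get this information is git worktree list; reading the files directly is only useful for seeing the mechanism.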

This all works just fine on your typical Linux system, where the file system is case-sensitive. We can have a branch named foo and one named FOO. These are two different branches, because Git is case-sensitive. If the refs are stored in .git/packed-refs, they're case-sensitive there. If the refs get unpacked into .git/refs/heads/foo and .git/refs/heads/FOO, they're case-sensitive there too. It all works fine.

But on your typical macOS or Windows system, the file system is case-insensitive (albeit case-preserving). If we unpack refs/heads/FOO from .git/packed-refs, and thus create .git/refs/heads/FOO, and then try to unpack foo too, we'll overwrite .git/refs/heads/FOO with the new hash ID for branch foo. Git will think everything went fine, until later, it goes to use foo or FOO.

Git has a simple rule: if the unpacked file can be opened and read, that file provides the answer. This way the unpacked files override the .git/packed-refs flat-file data, meaning that if we had a packed FOO branch and we unpacked it and updated it, we don't have to edit out the packed refs/heads/FOO lines. But on Windows and macOS, any attempt to open .git/refs/heads/foo opens .git/refs/heads/FOO instead, and we get the hash ID for the FOO branch instead of the hash ID for the foo branch.
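
The "unpacked file wins" rule can be sketched in shell as follows. This is a simplification, not Git's actual code: resolve_ref is a made-up name, it does not follow symbolic refs, and it ignores work-tree-specific refs.

```shell
# resolve_ref is a hypothetical helper, not part of Git.
# Usage: resolve_ref .git refs/heads/main
resolve_ref() {
    if [ -f "$1/$2" ] && [ -r "$1/$2" ]; then
        # A readable unpacked (loose) file provides the answer.
        cat "$1/$2"
    elif [ -r "$1/packed-refs" ]; then
        # Otherwise fall back to the flat file, skipping header and peeled lines.
        awk -v name="$2" '
            /^[#^]/    { next }
            $2 == name { print $1; found = 1; exit }
            END        { exit !found }
        ' "$1/packed-refs"
    else
        return 1
    fi
}
```

On a case-insensitive file system, the first test succeeds for refs/heads/foo whenever refs/heads/FOO exists as a file, which is the failure described above.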

How does this affect HEAD?

If we use the uppercase HEAD, it works fine. Suppose we are in added working tree #2 and we call for HEAD, in all uppercase. Git, being case-sensitive internally, intercepts the name HEAD and discovers that it's not in the main working tree and therefore looks for .git/worktrees/<something>/HEAD and gets the right file. This contains the detached-HEAD hash ID or the branch name (ref: refs/heads/branch) and everything works.

If we use the lowercase head, though, it breaks. Git tries to open .git/head. The file .git/HEAD exists, so Git uses that. That contains ref: refs/heads/main, because the main working tree has branch main checked out. So Git now believes it should turn main into a raw hash ID to know which commit to work with.

Hence, in our added work-tree in which branch is checked out:

git rev-parse HEAD
git rev-parse branch

both produce the same (correct) hash ID, but:

git rev-parse head
git rev-parse branch

produce the hash IDs for main and branch respectively.

Lowercase head works when you don't have any added working trees and no branch, tag, or other names that would match head in a "bad" way. It breaks when you do have added working trees since it very quickly matches the main working tree's HEAD file. See the six-step process for resolving a name as outlined in the gitrevisions documentation and note which step is step 1.
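
If you are not sure whether your file system folds case, you can probe it. The sketch below creates a file named HEAD in a scratch directory and checks whether the name head finds it. The function name fs_case_check is invented for this example, and the result applies only to the file system that holds the directory you pass in.

```shell
# fs_case_check is a hypothetical helper, not part of Git.
# Usage: fs_case_check /path/to/some/directory
fs_case_check() (
    # Work in a fresh scratch directory so nothing existing is touched.
    dir=$(mktemp -d "${1:-.}/casecheck.XXXXXX") || exit 1
    : > "$dir/HEAD"
    if [ -e "$dir/head" ]; then
        echo case-insensitive
    else
        echo case-sensitive
    fi
    rm -rf "$dir"
)
```

Either way, the advice stands: spell HEAD in uppercase, because relying on the lowercase spelling depends on file-system behaviour rather than on Git.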

The future

Git has been—for multiple years now—acquiring an implementation called "reftables" (which apparently is modeled after a JGit implementation that's been in use for many more years) in which branch names will be stored in a database. This will eliminate the case-folding issues on typical macOS and Windows systems. It will also mean that lowercase head will cease to work where it does work now. So don't use it.

torek