This doesn't normally happen at all. But it has been temporarily happening a lot lately. This confusion is likely to go on for a while, and then maybe eventually stop, or get worse. Here is what's going on. For background, see any number of my longer "how commits and branch names work in Git" answers, such as this one.
Git branch names are required to contain the hash ID of some existing, valid commit.1 But when we first create a new, totally-empty repository, there are no commits. If a branch name has to hold a commit hash ID, and there are no commits, what hash ID should Git store in the branch name?
Git's answer to this is that, in a new, totally-empty repository, there cannot be any branches. That's a simple and neat answer, but it has a big flaw, because Git has one more requirement: the special name HEAD shall contain the name of a branch (if you're on a branch), or the raw hash ID of some existing, valid commit (if you're in "detached HEAD" mode). We just said there are no commits, so you can't be in detached HEAD mode. Hence HEAD needs to contain the name of a branch, and yet we just said that there can't be any branches either.
Fortunately, there's no actual contradiction here, no paradox that requires some sort of magic state or something. The requirement is that HEAD contain the name of a branch, not the name of an existing branch. Git just allows HEAD to contain the name of a nonexistent branch, neatly sidestepping the paradox.
When you are in this state—on a nonexistent branch—Git will sometimes say that you are on an unborn branch, and sometimes will say that you are on an orphan branch. Git is not consistent about which phrase it uses, but it does generally use one of these two, or some variant of them.
Moreover, when you are in this state, the next commit you make is automatically a root commit. A root commit is simply a commit with no parents: an initial commit. Having now created that root commit, Git immediately stores the resulting hash ID into the branch name, thereby creating the branch as well, and the weird state is resolved. Your repository has one initial commit, and can now have an infinite number of branch names, all of which must select this one initial commit.
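The unborn-branch state can be observed directly. The following is a minimal sketch, assuming a POSIX shell and a throwaway directory; the demo identity and the empty commit are illustrative choices, not something from the discussion above.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# HEAD already names a branch, even though no branch exists yet.
branch=$(git symbolic-ref --short HEAD)
echo "HEAD names: $branch"

# No commit exists, so HEAD cannot be resolved to a hash ID.
before=$(git rev-parse -q --verify HEAD || echo unborn)
echo "before first commit: $before"

# The first commit is a root commit, and making it creates the branch.
git commit -q --allow-empty -m initial
after=$(git rev-parse --verify "refs/heads/$branch")
roots=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "after first commit: $after (root commits reachable: $roots)"
```

The branch name printed first depends on the Git version and its configuration, so it may be master, main, or something else.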
1 "Existing" and "valid" are redundant; I use them here as a sort of emphasis. The idea is that Git only allows any given branch name to contain the hash ID of some commit for which git cat-file -t <hash> will say commit and for which git cat-file -p <hash> will show you the internal commit data.
What's going on lately?
Why are we now seeing a rash of repositories with two, or sometimes more, root commits? The answer to that comes about by answering another question.
When we create a new, empty repository, what nonexistent branch name should HEAD contain? That's up to the git init command. In the past, git init would always create a new, empty repository in which the repository is on the nonexistent branch named master.
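Since the initial branch name is nothing more than the name stored in HEAD, it can be read and changed before any commit exists. Here is a minimal sketch, assuming a throwaway directory; the name trunk is a hypothetical choice used only for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Whatever name git init chose is stored in HEAD.
git symbolic-ref HEAD

# Point HEAD at a different nonexistent branch; there is still no branch at all.
git symbolic-ref HEAD refs/heads/trunk
now=$(git symbolic-ref HEAD)
count=$(git for-each-ref refs/heads | wc -l | tr -d ' ')
echo "HEAD: $now, existing branches: $count"
```

Git 2.28 and later also accept git init -b <name> and an init.defaultBranch configuration setting, so the name that git init picks depends on the Git version and how it is configured.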
Suppose we use a Git-Repository-Storage-Site like GitHub or Bitbucket to store repositories. These sites generally require that we use some kind of clicky-button web interface to create new repositories.2 Suppose further that we also use git init locally, on our laptops, to create new, empty repositories, or run git clone to copy the empty repository from GitHub, or wherever, to our laptops.
We now have to examine multiple cases.
2 GitHub, at least, finally provides a command-line tool, gh, that can do this sort of thing without having to click on annoying web interface buttons. Of course, if you like clicking on web interface buttons, perhaps they're not annoying.
Case 1: cloning an empty repository
The general form of git clone is, or at least can be described as:3
git clone [ -b <branch> ] [ -c config-options ] <url> [ <path> ]
This is, in effect, shorthand for the following six operations, five of which are Git commands; the five Git commands are run in the new directory:
1. mkdir path: This creates a new, totally-empty directory in which Git can create a repository. If you omit the path argument, Git will figure out one based on the URL.
2. git init: because this is run in a new, totally empty directory, it creates an empty repository: one with no commits and therefore, necessarily, no branches. Which branch are we on anyway? If step 6 goes well, that won't matter, but let's proceed.
3. git remote add origin url: This saves the URL you supplied, under the standard name origin.
4. As many git config commands as are needed to save the configurations you specified. (It's important to list this step because it might affect the next one. In particular, single-branch clones, not otherwise covered properly here, do some special configuring.)
5. git fetch origin: this calls up some other Git software, at the stored URL. We now obtain, from that other Git repository, all of their commits. The fetch command causes the other Git to list out all of their branch names as well, which our Git renames to turn them into remote-tracking names.
6. Last, our Git uses our -b argument to run git checkout or git switch. If we supply a -b argument, it must be the name of a branch that exists in the other repository.4 That's the branch that our Git will create, locally, and then check out, using the commit hash ID found by looking at their Git's branch names. That is, our Git looks at our remote-tracking names, which are based on their branch names, and picks out the right renamed branch name to get the right commit hash ID to check out our branch.
If we omit the -b flag, our Git is supposed to ask their Git which branch name they recommend. This part is subtly broken in current versions of Git. The recommended name is, quite simply, the one stored in HEAD. If you use a web interface to change the default branch in some GitHub repository, what you're really doing is having GitHub stuff a different name into the HEAD in that clone.
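The six operations can be run by hand to see what each one contributes. This is only a sketch of the description above, not what git clone literally executes: it assumes a local directory standing in for "their" repository, a hypothetical branch named main, and core.autocrlf as a stand-in for whatever -c options might have been given.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for "their" repository: one commit on a branch named main.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/main
git -C "$work/upstream" commit -q --allow-empty -m initial

mkdir "$work/copy"                       # operation 1: new, empty directory
cd "$work/copy"
git init -q                              # operation 2: empty repository, unborn branch
git remote add origin "$work/upstream"   # operation 3: save the URL as origin
git config core.autocrlf false           # operation 4: stand-in for -c options
git fetch -q origin                      # operation 5: commits plus origin/main
git checkout -q -b main origin/main      # operation 6: as if we had given -b main

git symbolic-ref --short HEAD
git rev-parse HEAD
```

After the last operation, the local branch main exists and points at the same commit as origin/main.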
Before we look at the "clone an empty repository" case, let's consider cloning a more typical, non-empty repository. There are two possibilities here:
- We provide a -b argument, and they have such a branch; that's the branch we're on, and the commit we get checked out. (If they don't have the branch we name, our git clone gives us an error and erases the directory it made, so that it looks like nothing even started up.)
- We omit the -b argument, but they recommend their standard branch name, whatever it is, and that's the branch we land on.
Hence, courtesy of step 6, we get some valid branch name in our HEAD and all is good. We're working with some existing commit; there's already an initial (root) commit and we will just add new commits as usual.
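For a non-empty repository, the name that the other Git recommends can be inspected from the outside, and changing the default branch can be imitated by rewriting the other repository's HEAD. This is a hedged sketch: it assumes Git 2.8 or later (for ls-remote --symref), a local directory standing in for the hosting site, and hypothetical branch names main and develop.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
up="$work/upstream"
git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/main
git -C "$up" commit -q --allow-empty -m initial
git -C "$up" branch develop

# Ask "their" Git which branch name its HEAD holds.
first=$(git ls-remote --symref "$up" HEAD | head -n 1)
echo "$first"

# Changing the default branch amounts to storing another name in their HEAD.
git -C "$up" symbolic-ref HEAD refs/heads/develop
second=$(git ls-remote --symref "$up" HEAD | head -n 1)
echo "$second"
```

A hosting site's "default branch" setting does the equivalent of the symbolic-ref step on the server side.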
But hold on: We are cloning a completely empty repository. It has no branch names. What branch name do they recommend? Can they recommend a branch name at all?
The answer to that last question—can they recommend a branch name, even when they don't have any branch names—is and has been "yes, they can" for about a decade. This means there are still some people using ancient versions of Git that can't, and therefore don't. Fortunately, hosting sites are not so old and creaky, so they can recommend a name. But now we hit that "subtly broken" part I mentioned.
This is being fixed, but at the moment, even with modern Git versions, our Git doesn't get and use a name from their Git. So if their Git picks, say, main, and our Git picks master, when creating that empty repository, here's what happens with the clone process:
- We (locally) execute steps 1-5 just fine.
- We then (locally) fail to execute step 6. Our local repository remains on whatever initial branch name we use, while their (GitHub's, or Bitbucket's, or whoever's) repository remains on whatever initial branch name they use. They also continue recommending this name, because it's still in their HEAD.
If "they" are GitHub, and they now use main as their standard first branch name, but we are using a version of Git that uses master as our standard first branch name, we and they have different "unborn branches".
We now make our first commit. This creates our branch master, which we use with git push. This creates the name master in the repository over on the web site (GitHub, or whatever) but still leaves them recommending main.
This is a recipe for confusion. It doesn't actually cause two separate root commits, but they're recommending a branch name we don't use, and we're using a branch name they don't recommend, at this point.
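The mismatch can be reproduced locally. The sketch below assumes a bare repository in a temporary directory standing in for the hosting site, with the two unborn branch names set by hand so that the result does not depend on the installed Git's defaults.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Their" empty repository, recommending main.
git init -q --bare "$work/hosted.git"
git -C "$work/hosted.git" symbolic-ref HEAD refs/heads/main

# "Our" empty repository, on unborn master.
git init -q "$work/laptop"
cd "$work/laptop"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/hosted.git"

# First commit creates master here; the push creates master there.
git commit -q --allow-empty -m initial
git push -q origin master

their_head=$(git -C "$work/hosted.git" symbolic-ref HEAD)
their_branches=$(git -C "$work/hosted.git" for-each-ref --format='%(refname)' refs/heads)
echo "they recommend: $their_head"
echo "they actually have: $their_branches"
```

Their HEAD still names main, which does not exist, while the only branch they have is master.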
3 The command offers a lot of other options, such as -o, --no-checkout, and so on, which can alter these steps slightly, but the overall plan remains in place. The biggest one for our purposes is --no-checkout, which omits step 6 entirely.
4 Alternatively, the argument to -b can be a tag name, which, if this succeeds, will result in a detached-HEAD clone, but again we'll just ignore this case.
Case 2: two non-empty initial repositories
I'm not entirely sure what to call case 2, but I have seen people do this:
- They run git init on their laptops, creating an empty repository that is on branch master.
- They then create an initial commit in this laptop repository.
- Meanwhile, they use the GitHub (or whatever site) web interface to create a new repository, but choose to create a non-empty repository: one with one commit containing template-sourced LICENSE, README, and/or other initial files. This creates a branch named main.
They now connect their laptop repository to their GitHub repository, using git remote add origin url, and run git push origin master or similar. They also run git fetch origin (before or after git push origin master).
They now have, in their laptop and GitHub repository, two initial commits, not related to each other, on two different branch names.
They don't understand that they have just done this. Git doesn't complain: this is actually a perfectly valid situation. The set of commits in a repository is not required to have only one root commit. The commit graph in a repository can contain multiple disjoint subgraphs.
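Whether a repository has ended up with more than one root commit can be checked without drawing anything. The following sketch imitates case 2 with two local directories; the paths, branch names, and commit messages are assumptions made for the demonstration, and only the fetch half of the connection is shown.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the web-created repository with its template commit on main.
git init -q "$work/hosted"
git -C "$work/hosted" symbolic-ref HEAD refs/heads/main
git -C "$work/hosted" commit -q --allow-empty -m "template files"

# The laptop repository with its own, unrelated initial commit on master.
git init -q "$work/laptop"
cd "$work/laptop"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "laptop initial"

git remote add origin "$work/hosted"
git fetch -q origin

# List every parentless commit reachable from any ref.
roots=$(git rev-list --max-parents=0 --all | wc -l | tr -d ' ')
echo "root commits: $roots"

# Unrelated histories have no merge base, so this command fails.
git merge-base master origin/main || echo "no common ancestor"
```

A count above one, or a failing git merge-base, indicates disjoint subgraphs like the ones described above.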
Your situation looks like the latter
Suppose we use this method, of creating two non-empty repositories: one on GitHub with a main, and one on a laptop with a master.
If we never get around to pushing master, but do create a dev branch using the name master, our new dev branch on the laptop will be using this same initial commit. Note that we can also run:
git checkout -b dev
when we're in the unborn-master-branch state. The next commit we make will be a root commit.
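What happens to master in that case can be checked with a short experiment. This is a sketch in a throwaway directory, with the unborn branch set to master by hand so the result does not depend on the installed Git's default.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # start on unborn master
git checkout -q -b dev                    # move to unborn dev instead

# This commit is a root commit, and it creates dev, not master.
git commit -q --allow-empty -m initial
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
roots=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "branches: $branches, root commits: $roots"
```

The name master never comes into existence here: only dev is created, holding the new root commit.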
There is a giveaway. When we make a new commit, Git prints messages:
$ mkdir empty && cd empty
$ git init
Initialized empty Git repository in ...
$ echo root > README
$ git add README
$ git commit -m initial
[master (root-commit) ea0681d] initial
1 file changed, 1 insertion(+)
create mode 100644 README
Note the (root-commit) in the output. This indicates that this git commit command created the current branch name: we were in the orphan/unborn state a moment ago.
Subsequent commits omit the (root-commit) part here. We know that these commits are adding on to whatever commits we have so far.
The only way to be sure about the commit graph, though, is to use something that shows the commit graph. See Pretty Git branch graphs (and note that --oneline has a subtle flaw where it can look like two separate graphs are conjoined, when they are not).
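One way to avoid misreading the drawing is to print each commit's parents next to it, so a root commit is recognizable regardless of how the graph lines happen to fall. The sketch below builds a small repository with two roots using git checkout --orphan; the branch name other, the commit messages, and the format string are illustrative choices.

```shell
# Throwaway identity so that git commit works without global configuration (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "first root"
git commit -q --allow-empty -m "child of first root"

# Put HEAD on a new unborn branch, so the next commit is a second root.
git checkout -q --orphan other
git commit -q --allow-empty -m "second root"

# %p expands to the abbreviated parent hashes; it is empty for a root commit.
git log --all --graph --format='%h parents=[%p] %s'
```

Every line showing parents=[] is a root commit, even when two separate graphs are drawn one directly above the other.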