The thing to realize about Git is that it is only commits that matter. Commits are what Git is all about. The commits themselves find the other commits, in a twisty little ball of commits, once you get into the commits. So: what are branch names good for? It's not nothing, but it's kind of close.
The real name of a commit is its hash ID. But commit hash IDs seem random, and there is no way to predict what the hash ID of some commit is. Once you find one commit, you can use that commit to find more commits. But you have to find one of them first, somehow, and that's where a branch name comes in. A name lets you get started. It gets you into the nest of commits. From the name, you can now find the hash ID of some particular commit. That commit lets you find another commit, which lets you find still another commit, and so on.
Now all my code is on the 'master' branch which is not the main branch, so I was wondering how I could move everything to the 'main' branch?
The TL;DR here is that you're in a tricky situation and there is no single right answer. You will have to decide what you want to do. You can:
- rename your own master branch to main and try to get all other users of clones of the original repository to use your commits; or
- figure out how to combine and/or re-do some or all commits in the two repositories.
In other words, all you might have to do is rename the branch. But there is definitely still some problem, because right now you have two branch names. It's time to take a closer look at this whole thing: why is it the commits that matter, and how do these names really work?
Long
Let's start with the simplest form of related commits: a small, simple, linear chain. Suppose we create a new, totally-empty repository with no commits in it. There's a rule about Git branch names: a branch name must hold the hash ID of exactly one (1) existing, valid commit.1 Since there are no commits, there can be no branch names.
To fix this problem, we make our first commit. If you use GitHub, they'll often make that first commit for you, creating one with just a README and/or LICENSE type file in it. Having that first commit allows you to create as many branch names as you like: they'll all store that one commit's hash ID.
Note that every commit gets its own unique hash ID. This hash ID is universal across all Git repositories everywhere.2 This is why Git hash IDs are as big and ugly as they are.3 It also allows Git programs to connect to other Git programs that are using other Git repositories, and figure out which commits each repository has, just by exchanging hash IDs. So the hash IDs are crucial. But they're quite useless to humans, who can't keep them straight. So that's why we have branch names.
There is one other thing to know about these hash IDs and the underlying objects (commits, and the non-commit objects that Git stores, mentioned in footnote 1): the hash IDs are simply fancy checksums of the stored object. Git looks up the object—the commit, or its related data—using the hash ID, but then also makes sure that the stored object's checksum matches what it used to look it up. So no part of any stored object, in Git, can ever change. If the checksum does not match, Git declares the storage to be corrupted, and refuses to proceed.
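To make the "fancy checksum" idea concrete, here is a minimal sketch in Python of how a file's contents get their ID. It assumes a repository using the traditional SHA-1 object format; the function name git_blob_id is made up for this illustration and is not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git gives to file contents (a "blob")."""
    # Git hashes a short header, "blob <size in bytes>\0", followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_id(b"hello world\n")
changed = git_blob_id(b"hello world!\n")
print(original)
print(original != changed)  # changing even one byte produces a different ID
```

Because the ID is computed from the content, the same bytes always get the same ID, in every repository, and any change to the bytes gives a different ID. That is why a stored object can never be modified in place: the modified data would no longer match the key used to look it up.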
Anyway, let's say we started with one commit, on one branch named bra, and then created two more commits, so that we now have a tiny repository with just three commits in it. Those three commits have three big ugly hash IDs, unique to those three commits, but we'll just call them commits A, B, and C. Let's draw them like this. Each element in this drawing has a purpose:
A <-B <-C <--bra
Commit C stores two things: a snapshot of every file, and some metadata. The snapshot acts as the commit's main data and lets you get back all the files, in whatever form they had at the time you (or whoever) made commit C. The metadata include the name of the person who made the commit, their email address, and so on; but crucially for Git itself, the metadata in commit C include the hash ID of earlier commit B.
We say that commit C points to B. By reading out commit C, Git can find the hash ID of earlier commit B.
Commit B, of course, also contains data (a full snapshot of every file) and metadata, including the hash ID of earlier commit A. So from B, Git can find A.
Commit A is a bit special, because it was the first-ever commit. It has no backwards-pointing arrow leading to any earlier commit, as there was no earlier commit. Git calls this a root commit. It lets Git stop going backwards.
The commit we need to use to find all other commits, in this repository, is commit C. To find commit C, we use the branch name, bra. It contains the hash ID of commit C, so bra points to C, and that's how we get started.
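The backwards walk described above can be sketched with two small dictionaries standing in for Git's two databases. This is a toy model, not Git's real storage: the one-letter commit names replace real hash IDs, and the function name history is an assumption made for illustration.

```python
# Toy object database: commit -> parent commit (None marks the root commit).
parents = {"A": None, "B": "A", "C": "B"}

# Toy name database: branch name -> the commit it points to.
branches = {"bra": "C"}


def history(branch):
    """List the commits found by starting at a branch name and working backwards."""
    commit = branches[branch]      # the name gets us into the chain
    found = []
    while commit is not None:      # a root commit has no parent, so the walk stops
        found.append(commit)
        commit = parents[commit]   # follow the backwards-pointing arrow
    return found


print(history("bra"))  # ['C', 'B', 'A']
```

Note that the walk needs only one starting point. Everything else comes from the commits themselves.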
1There's no such thing as an existing but invalid commit. The point of saying "existing, valid commit" is really that hash IDs are used for more than just commits, so you could have a valid hash ID, but for something that's not a commit. But you won't be dealing with these non-commit hash IDs yet, if ever. You do have to deal with commit hash IDs, so those are the ones we care about.
2Technically, two different commits could have the same hash ID as long as those two Git repositories never meet. A commit meeting its doppelgänger causes tragedy and sadness, so that's bad. (Well, technically, what happens is that the two Gits, as they're having Git-sex so as to exchange commits, simply malfunction. The sadness is in the users of those Gits, who expected some sort of beautiful baby.)
3As of a few years ago, even this is starting to become insufficient. See How does the newly found SHA-1 collision affect Git? for details.
Adding new commits on one branch
Given that we have:
A <-B <-C <--bra
we start by extracting commit C into a work area. The contents of each commit can't be changed, and that includes the stored files.4 So now we have commit C "checked out". Git uses the name bra to remember the hash ID of C, and knows that the current commit has this hash ID.
We now make any changes we like: add new files, delete existing files, update files, and so on. We inform Git about these updates with git add.5 Then we build a new commit with git commit. Git saves away the new snapshot, and adds the appropriate metadata, including the current commit's hash ID, to produce a new commit D that points back to existing commit C:
A <-B <-C   <--bra
         \
          D
As the last step of git commit, Git stores the latest commit's hash ID into the branch name. Since commit D points back to existing commit C, we now want to start our view of the repository, via the branch named bra, by looking at commit D:
A <-B <-C <-D <--bra
and the commit is now complete.
4The files' contents are stored as blob objects inside the repository. This compresses them and de-duplicates them, so that when two commits share the same file contents, they literally share the internal objects. You don't normally need to know or care about this, though.
5The git add step manipulates the thing that Git calls, variously, its index, or the staging area, or (rarely these days) the cache. To save space in this answer, I leave out all the useful details.
Multiple branch names
To use more than one branch, we normally add a new branch name, using git branch and git checkout, or combining the two with git checkout -b (or in Git 2.23 or later, git switch -c). The way this actually works is that it just creates the new branch name, pointing to the same commit as the current commit:
A--B--C--D <-- bra, nch
We now have two branch names but both select the same commit. Right now, it does not matter which name we use, because both names select commit D. But in a moment, it will become important, and Git always wants to be able to tell us which branch we're "on", so that git status can say on branch bra or on branch nch. To make that work, Git attaches the special name HEAD to one branch name, like this:
A--B--C--D <-- bra (HEAD), nch
or this:
A--B--C--D <-- bra, nch (HEAD)
Whichever name has HEAD attached to it, that's the current branch name. Whichever commit this name points to, that's the current commit.
Now we'll create a new commit in the usual way. It gets a new unique hash ID, but we'll just call it commit E, to keep our sanity: only a computer can handle the real hash IDs. Let's draw it in:
A--B--C--D   <-- bra
          \
           E   <-- nch (HEAD)
The branch name that got updated is nch, because that's our current branch. The current commit is now commit E, and that's the commit we have checked out.
If we git checkout bra, or git switch bra in Git 2.23 or later, we choose bra as our current branch and commit D as our current commit. So commit D becomes the one checked out:
A--B--C--D   <-- bra (HEAD)
          \
           E   <-- nch
Now any new commit we make will update the name bra:
           F   <-- bra (HEAD)
          /
A--B--C--D
          \
           E   <-- nch
This is the sort of branching we usually do, in a Git repository. Note that commits A-B-C-D are on both branches, because no matter which name we start with, when we work backwards, we find all those commits. But the only way to find commit E is to start with the name nch. The only way to find commit F is to start with the name bra.
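The whole sequence in this section (make commits, add a second name, move HEAD, commit on each name) fits in a small Python sketch. The Repo class and its method names are hypothetical stand-ins for the corresponding Git commands, and the model assumes every commit has at most one parent.

```python
class Repo:
    """Toy repository: single-parent commits, branch names, and HEAD."""

    def __init__(self, first_branch):
        self.parents = {}          # commit -> parent commit (None for a root)
        self.branches = {}         # branch name -> last commit on that branch
        self.head = first_branch   # HEAD is attached to this branch name

    def commit(self, name):
        # The new commit points back to the current commit (if there is one)...
        self.parents[name] = self.branches.get(self.head)
        # ...and the current branch name moves to the new commit.
        self.branches[self.head] = name

    def branch(self, name):
        # A new branch name points to the same commit as the current one.
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        # Switching branches only re-attaches HEAD in this model.
        self.head = name

    def log(self, branch):
        """Commits found by working backwards from a branch name, oldest first."""
        found, commit = [], self.branches[branch]
        while commit is not None:
            found.append(commit)
            commit = self.parents[commit]
        return found[::-1]


repo = Repo("bra")
for name in "ABCD":
    repo.commit(name)
repo.branch("nch")
repo.checkout("nch")
repo.commit("E")
repo.checkout("bra")
repo.commit("F")

print(repo.log("bra"))  # ['A', 'B', 'C', 'D', 'F']
print(repo.log("nch"))  # ['A', 'B', 'C', 'D', 'E']
```

Only the name that HEAD is attached to ever moves when we commit. The other name keeps pointing wherever it pointed before.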
Branch names find commits
This is what branch names are good for. They find the starting—well, ending?—commit of the branch. In fact, that's how branches are defined, in Git. The name holds the hash ID of the last commit on the branch. Whatever hash ID is in the name, that's the last commit, even if there are more commits. When we have:
           F   <-- bra
          /
A--B--C--D   <-- main
          \
           E   <-- nch
there are three last commits, even though there are two commits after D. There are three ways to find commits A-B-C-D, too: we can start with the name main and work backwards, or we can start with either of the other two names and work backwards.
How history relates
Suppose we have this:
          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2
We can pick either of these two branch names (and hence either commit J or commit L) and then ask Git to merge the other last commit. Without going into any of the rest of the important details, the way Git handles this merge request is to work backwards to find the best shared commit, which in this case is commit H. The merge then proceeds using commit H as the merge base.
This all works because the two branch tip commits, J and L, are related: they have a shared parent (well, grand-parent, in this case). This shared parent is a common starting point. The two tip commits can therefore each be converted to a set of changes since the common starting point.
Changing a branch name is trivial
Each Git repository has its own private branch names. When you hook two Git repositories to each other, what really matter—because they can't change and uniquely identify the commits—are the commit hash IDs. So if we have:
A--B--C <-- bra (HEAD)
we can just arbitrarily change this name to any new name we like:
A--B--C <-- xyzzy (HEAD)
Nobody cares whether the name is bra or xyzzy or whatever (well, except for irrational humans, who have ideas pop into their heads when we use evocative names, like plugh or colossal-cave-adventure). And, when we're using Git clones to share work, we humans like to share our branch names too, to help keep our own sanity. So we don't normally go about renaming branches willy-nilly. But the actual names really don't matter, not to Git at least.
If this were your own situation (you have a master, they changed the name to main), you could just rename your master to main yourself, and you and they would both use the same name to find the same commits. This would be easy and simple. It's not your situation, though, because if it were, you would not be seeing that complaint about unrelated histories.
More than one root commit
All of the diagrams above have only one root commit: in our case, commit A. (Well, the ...--G--H probably has a single root commit.) But there are a bunch of different ways, in Git, to create extra root commits. One method is using git checkout --orphan (or git switch --orphan). Suppose we start with:
A--B--C <-- bra (HEAD)
and then use this technique to create a new root commit D, found by the new name nch, that doesn't point back to C or to anything else:
A--B--C <-- bra
D <-- nch (HEAD)
This works fine in Git, and we can go on and create more commits if we like:
A--B--C <-- bra
D--E--F <-- nch (HEAD)
What we can't do, now, is simply merge these two branches, because git merge needs to find the best common ancestor. Git does this by starting at each end and working backwards until the histories meet ... and in this case, they never meet! One history ends (starts?) at A, and the other ends (starts?) at D, without ever coming across the same commit on both branches.
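A toy check in Python shows what "never meet" means for the two chains in the drawing. The helper names chain, root_of, and histories_meet are made up here, and the model assumes each commit has at most one parent.

```python
# The two chains from the drawing: commit -> parent (None marks a root commit).
parents = {"A": None, "B": "A", "C": "B", "D": None, "E": "D", "F": "E"}
branches = {"bra": "C", "nch": "F"}


def chain(branch):
    """Commits found by working backwards from a branch name, newest first."""
    found, commit = [], branches[branch]
    while commit is not None:
        found.append(commit)
        commit = parents[commit]
    return found


def root_of(branch):
    """The commit where the backwards walk stops."""
    return chain(branch)[-1]


def histories_meet(first, second):
    """True if some commit can be reached from both branch names."""
    return bool(set(chain(first)) & set(chain(second)))


print(root_of("bra"), root_of("nch"))  # A D
print(histories_meet("bra", "nch"))    # False
```

With no commit in common there is no candidate for a merge base at all, which is the condition behind the "unrelated histories" complaint.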
Multiple repositories
With all of the above in mind, let's add clones into the picture. Remember that each Git repository is, essentially, two databases:
- One database contains commit objects, and other internal Git objects. Each object has a big ugly hash ID as its key, and Git looks up the actual values in a simple key-value datastore.
- The other database has names (branch names, tag names, and other such names), each of which stores one hash ID. These hash IDs get you into the commits, so that you can find all the commits.
When you run git clone url, you have your Git create a new, empty repository, with no commits and no branches in it, then call up some other Git and have that Git look at some other repository, based on the URL you gave. That other Git has its two databases: commits and other objects (keyed by hash ID), and name-to-hash-IDs (keyed by names). They send, to your Git, all the objects, which your Git puts into your own database.
You now have all their commits, and none of their branch names.
In order to find these commits, your Git takes their branch names and changes them. Instead of, say, master or main, your Git makes up names like origin/master or origin/main. These names are your Git's remote-tracking names. They remember the hash IDs that their Git had in their branch names.
These remote-tracking names work just as well to find commits. You don't actually need any branch names at all, just yet. But git clone has not quite finished: its last step is to run git checkout (or git switch), to pick some branch name for you.
Of course, you have no branches yet, but git checkout / git switch has a special feature: if you ask Git to check out a name that does not exist, your Git scans your remote-tracking names. If they have a master, you now have an origin/master, and when you try to git checkout master, your Git will create your own new master, pointing to the same commit as your origin/master. That, of course, is the same commit as their master!
This means you now have, in your own repository:
A--B--C <-- master (HEAD), origin/master
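The clone steps just described can be sketched in Python with plain dictionaries for the two databases. This is a toy model under stated assumptions: the function name clone, its parameters, and the returned dictionary layout are inventions for illustration, and the branch to check out is simply passed in.

```python
def clone(their_objects, their_branches, their_head, remote="origin"):
    """Toy model of cloning: copy objects, rename their branches, check one out."""
    # Step 1: copy every object (keyed by hash ID) into our own database.
    objects = dict(their_objects)

    # Step 2: their branch names become our remote-tracking names.
    remote_tracking = {f"{remote}/{name}": tip for name, tip in their_branches.items()}

    # Step 3: the final checkout creates our own branch from the
    # matching remote-tracking name, because we have no branches yet.
    branches = {their_head: remote_tracking[f"{remote}/{their_head}"]}

    return {
        "objects": objects,
        "branches": branches,
        "remote_tracking": remote_tracking,
        "head": their_head,
    }


mine = clone({"A": None, "B": "A", "C": "B"}, {"master": "C"}, "master")
print(mine["branches"])         # {'master': 'C'}
print(mine["remote_tracking"])  # {'origin/master': 'C'}
```

The only branch name we end up with is the one created by the checkout at the end. Every other name of theirs survives only in remote-tracking form.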
Now, suppose they change their name master to main. If that's all they do (if they just rename their branch), you'll end up with this, after you run git fetch to get any new commits from them (there are none) and update your remote-tracking names:
A--B--C <-- master (HEAD), origin/master, origin/main
Your Git adds origin/main to your repository, to remember their main. They have, in effect, deleted their name master, and your Git probably should delete your origin/master to match, but the default setup for Git does not do this.6 So you end up with two remote-tracking names, one of them stale. You can clean this up manually with:
git branch -d -r origin/master
or:
git fetch --prune origin
(The git fetch has the side effect of updating all your remote-tracking names right then, including getting any new commits from them, so that's usually better. It takes longer though, as it has to call up their Git over the Internet, or wherever the URL goes.)
6To make Git behave this way, for all your repositories, use git config --global fetch.prune true.
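The difference between a plain fetch and a pruning fetch is easy to see in a toy Python model of the name database. The function fetch and its parameters are hypothetical; only the remote-tracking names are modeled, not the transfer of commits.

```python
def fetch(remote_tracking, their_branches, prune=False, remote="origin"):
    """Toy model of updating remote-tracking names from their branch names."""
    updated = dict(remote_tracking)

    # Create or update one remote-tracking name per branch they have now.
    for name, tip in their_branches.items():
        updated[f"{remote}/{name}"] = tip

    if prune:
        # Drop names under this remote that no longer match any of their branches.
        live = {f"{remote}/{name}" for name in their_branches}
        updated = {
            name: tip
            for name, tip in updated.items()
            if name in live or not name.startswith(f"{remote}/")
        }
    return updated


before = {"origin/master": "C"}
after_rename = {"main": "C"}   # they renamed master to main

print(fetch(before, after_rename))              # {'origin/master': 'C', 'origin/main': 'C'}
print(fetch(before, after_rename, prune=True))  # {'origin/main': 'C'}
```

Either way, commit C itself is untouched. Pruning removes a stale name, not a commit.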
If they'd done that, things would be reasonable
Suppose they did do just that: rename their master to main, without actually adding or deleting any commits. Or, they might do the renaming, and then add more commits. Let's draw the latter: it's a bit more complicated but it all works out the same, in the end.
They had:
A--B--C <-- master
and you ran git clone and got:
A--B--C <-- master (HEAD), origin/master
in your own repository. (We can leave out the HEAD in their repository because we don't normally care which branch they check out.) Then they rename their master to main and add commits D-E. You run git fetch and get:
A--B--C   <-- master (HEAD), origin/master
       \
        D--E   <-- origin/main
Your Git fails to delete origin/master, even though they have no master any more, so we leave it in the drawing. Note that it's harmless: it just marks commit C. We can delete it (we can set fetch.prune or run git fetch --prune or whatever) or leave it; it's not really important. Branch names don't matter! Only commits matter. Commit C is still there, whether or not there's a name pointing to it.
Anyway, perhaps you make your own new commit F:
        F   <-- master (HEAD)
       /
A--B--C
       \
        D--E   <-- origin/main
If you ask your Git to merge commits F and E, it works, because they have a common ancestor: F's parent is C, and E's parent's parent is C.
This tells us that this is not what they did.
What seems to have happened instead
If we assume that you did not make a bunch of unrelated commits, what must have happened, in their Git repository (over on GitHub), is that they made a new root commit, and used the name main to find it:
A--B--C <-- master
D <-- main
Then, they probably deleted their name master. That left them, in their repository, with this:
A--B--C ???
D <-- main
At this point (or just before it) they may or may not have copied some or all of their A-B-C commits to new commits that come after D:
A--B--C ???
D--B'-C' <-- main
Here, commit B' is a copy of commit B: it does to D whatever B did to A. Likewise, C' is a copy of C, doing to B' whatever C did to B. The new commits have new and different hash IDs and point backwards to commit D as their root, though. So when you run git fetch to connect your Git to their Git, their new commits are these D-B'-C' ones, so that you, in your repository, wind up with:
A--B--C <-- master (HEAD), origin/master
D--B'-C' <-- origin/main
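Why does a copy get a different hash ID even when the files are the same? Because the parent's hash ID is part of what gets hashed. The Python sketch below uses a toy stand-in for the commit hash: the function commit_id and its payload layout are assumptions for illustration, not Git's real commit format (which also covers author, committer, and timestamps). It also simplifies by giving the copy the same snapshot as the original.

```python
import hashlib


def commit_id(parent_id, snapshot, message):
    """Toy stand-in for a commit hash: covers the parent ID, the files, and the message."""
    payload = repr((parent_id, sorted(snapshot.items()), message)).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


files_a = {"README": "v1"}
files_b = {"README": "v1", "main.py": "print('hi')"}

a = commit_id(None, files_a, "first commit")
b = commit_id(a, files_b, "add main.py")

# A new, unrelated root commit, and a copy of B placed on top of it.
d = commit_id(None, {"LICENSE": "MIT"}, "new root")
b_copy = commit_id(d, files_b, "add main.py")

print(b != b_copy)  # True: same files and message, different parent, different ID
```

Since each ID depends on the parent's ID, changing the root changes every ID after it. That is why D-B'-C' shares nothing with A-B-C as far as Git is concerned.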
If you delete your origin/master (since their master is gone), nothing really changes: your own Git is still finding commit C. Their Git can't find commit C (they may even have thrown it away by now; Gits eventually delete un-find-able commits) but your Git can, through your master. If you've made new commits since then, like the F we drew earlier, you even have this:
        F   <-- master (HEAD)
       /
A--B--C   <-- origin/master
D--B'-C'  <-- origin/main
You can't do a merge because these chains have no shared history.
So what can you do?
You are now faced with a bunch of choices. Which one to use depends on how much work you want to do, how much work you want to make other people do, and how much control you have over the other Git repositories.
You can:
Keep using your commits (only) and force everyone else to switch.
There was no reason to change the commits. The originals are still just as good as they ever were. Someone made a mistake, copying them. Make them eat their mistake: rename your master to main, use git push --force origin main, and make the GitHub (or other central storage server) repository use your commits, under the name main that everyone has agreed to.
Copy the commits of yours that you like, adding them to the end of their last commit.
Assuming that their commit C' has the same saved snapshot as your (and originally their) commit C, or whatever commit it is that's the last copy of an original, you can probably just add your work after C', using git cherry-pick for each commit, or git rebase --onto to do multiple cherry-pick operations. See other StackOverflow questions for how to do that.
Merge with --allow-unrelated-histories.
This technique can take the least time and effort on your part, but it could be messy and painful: the rebase / cherry-pick option in the middle may be faster and easier. All that --allow-unrelated-histories does is pretend that, before the separate root commits, there was a single commit with no files in it. In some cases, this works easily. In most cases, you get a bunch of "add/add conflicts" that require a lot of manual work.
It also has the rather ugly side effect of leaving extra, mostly-useless commits in your repositories, which you then carry around forever. If nobody looks at this history (and the two roots), nobody will care, but it's still there. Whether it bothers you (or others) is another question entirely.
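To see why an empty pretend-base tends to produce add/add conflicts, here is a rough Python sketch that merges two snapshots at whole-file granularity. It is a simplification: real Git works on file contents and leaves conflict markers for you to resolve. The function name merge_unrelated and the sample file names are made up.

```python
def merge_unrelated(ours, theirs):
    """Merge two snapshots (path -> content) as if the merge base had no files."""
    merged, conflicts = {}, []
    for path in sorted(set(ours) | set(theirs)):
        if path in ours and path in theirs:
            if ours[path] == theirs[path]:
                merged[path] = ours[path]   # both sides "added" identical content
            else:
                conflicts.append(path)      # both sides "added" it differently: add/add
        else:
            # Only one side has the file, so it is a plain addition.
            merged[path] = ours[path] if path in ours else theirs[path]
    return merged, conflicts


ours = {"README": "my notes", "app.py": "print('hi')"}
theirs = {"README": "their notes", "app.py": "print('hi')", "LICENSE": "MIT"}

merged, conflicts = merge_unrelated(ours, theirs)
print(conflicts)  # ['README']
print(merged)     # {'LICENSE': 'MIT', 'app.py': "print('hi')"}
```

With an empty base, every file looks newly added on both sides, so any file that differs at all between the two tips becomes a conflict. When the two tips are nearly identical, this option goes smoothly; when they have drifted, it does not.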
There's no way I can pick one of these options for you, and this isn't necessarily the universe of all options, but by this point you should at least have a good understanding of what happened, and why these are ways to deal with it.