TL;DR
You have run git pull. The git pull command means:
- Run git fetch for me.
- Run a second Git command for me, as soon as the fetch finishes.
That second command defaults to git merge, and it is this git merge that is causing your problem. Read through the long discussion below to see why.
I advise new Git users to avoid git pull. Run git fetch yourself. Then, if that's appropriate, run the second Git command yourself: git merge, or git rebase, or whatever second command you might want to use, if any. This gives you a chance to stop and look at what git fetch fetched, before you leap into running a second command that might not be appropriate after all.
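As a rough sketch of that fetch-then-look workflow, the script below builds a throwaway origin plus two clones in a temporary directory, lets the second clone push a commit, and then fetches and inspects before merging. The directory names (seed, mine, other), the demo identity, and the file contents are all made up for illustration; only the Git commands themselves are the point.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build a tiny "origin" holding one commit on master.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master    # name the first branch master on any Git version
echo one > file.txt
git add file.txt
git commit -q -m "first"
git clone -q --bare "$work/seed" "$work/origin.git"

# Two independent clones: ours, and someone else's.
git clone -q "$work/origin.git" "$work/mine"
git clone -q "$work/origin.git" "$work/other"

# Someone else adds a commit and pushes it to origin.
cd "$work/other"
echo two >> file.txt
git commit -q -am "second"
git push -q origin master

# In our clone: fetch first, then look before doing anything else.
cd "$work/mine"
tip_before=$(git rev-parse master)
git fetch -q origin
tip_after_fetch=$(git rev-parse master)                # fetch does not move our own branch
incoming=$(git rev-list --count master..origin/master) # commits they have that we lack
outgoing=$(git rev-list --count origin/master..master) # commits we have that they lack
git log --oneline master..origin/master                # what exactly came in

# Only now pick the second command; here a fast-forward merge is appropriate.
git merge -q --ff-only origin/master
```

Had outgoing been nonzero, or had the incoming commits looked like old copies of rebased work, this is the point where you would stop and choose something other than a merge.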
Long
First, that's not the same branch. It's a different branch with the same name.
Analogies are terrible ways of reasoning, but sometimes they are useful. Imagine you're at a party and everyone there is named Bob. They all have the same name. Does that mean they are all the same person? Of course not—but they are all "Bob". You're just going to have to use some other name to distinguish them.
That's what Git is attempting here:
merge branch ... of ... into xxx-2222
The two blanks here get filled in with:
- what they call their branch, and
- the name you use to talk to them
so that you can later realize: Oh, that wasn't my xxx-2222, that was their xxx-2222. That's a little like realizing that you weren't talking to Bob Jones, but rather to Bob Smith.
So, let's talk a bit about naming things in Git.
Commits have unique (but ugly) names; humans use branch names
There is exactly one name you can count on every time, in any and every Git repository. That name is a hash ID. Hash IDs are the big ugly strings of hexadecimal digits that git log prints, such as 8dca754b1e874719a732bc9ab7b0e14b21b1bc10. These IDs are unique and never repeated, so that you either do have commit 8dca754b1e874719a732bc9ab7b0e14b21b1bc10 (which is a commit in the Git repository for Git), or you don't (presumably because you've never mingled your Git repository with one for Git itself: and unless you're going to write some code to modify Git, why would you?).
Every commit stores a snapshot of all of your files, plus some metadata. The metadata includes who made the commit—name, email address, and date-and-time-stamp—but also the raw hash ID of the commit that comes before the new commit. This means that every commit remembers its immediate predecessor or parent, by the raw hash ID.
What that means is that we can draw pictures of commits using backwards-pointing arrows, like this:
... <-F <-G <-H
Here, H stands in for the hash ID of the last commit we've made. It remembers the hash ID of the previous commit G. G in turn remembers the hash ID of commit F, and so on.
To use commit H, we'd have to memorize its big ugly hash ID. We don't need to remember G's any more, because that's in H. We don't need to remember F's, because we can use H to find G, and G has F's hash ID. This pattern goes on and on: all we need to do is remember the hash ID of the last commit.
But why should we remember it, when we have a computer? We can have the computer remember the hash ID. For instance, let's have the name master remember the hash ID of commit H, like this:
...--F--G--H <-- master
Now when we go to make a new commit, we'll start with git checkout master (which gets us commit H to work on, and remembers that we're "on branch master", as git status would say), and we'll do some work, git add, and git commit. Git will save a new snapshot of all of our files, and come up with some new and unique big ugly hash ID, which we'll just call I. New commit I will remember the hash ID of commit H:
...--F--G--H
            \
             I
and then, as the last step of making the commit, Git will store the new hash ID into our name master, so that master points to I instead of H:
...--F--G--H--I <-- master
I remembers H, which remembers G, and so on, so it's OK to have our master remember only the new hash ID for I.
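To watch a branch name move, here is a minimal sketch in a throwaway repository. The temporary directory, the demo identity, and the commit messages "H" and "I" are stand-ins chosen to match the drawings; the real hash IDs will be different on every run.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master on any Git version
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo one > file.txt
git add file.txt
git commit -q -m "H"
old_tip=$(git rev-parse master)            # the hash ID that master remembers right now

echo two >> file.txt
git add file.txt
git commit -q -m "I"
new_tip=$(git rev-parse master)            # master now remembers the new commit
parent_of_new=$(git rev-parse master^)     # the parent stored inside the new commit

echo "master was:    $old_tip"
echo "master is now: $new_tip"
echo "its parent is: $parent_of_new"
```

The last two lines of output show the point of the drawing: the name moved forward, and the old hash ID is still reachable because the new commit stores it as its parent.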
Git is distributed, which means there are lots of separate repositories
You have two different Git repositories here. One is the Git repository on your own local machine, where you run git checkout and git add and git commit and so on. That repository is truly yours: you have 100% total control over it.
The other is over on GitHub. Technically that one is their (GitHub's) repository. They have handed most of the control of it over to you, so in most useful ways it's yours too, but it's more convenient—plus more technically accurate—to call it their repository, so let's do that here.
Repositories can share
These two repositories need not have the same commits, but in general, you probably want any new commits you make in your repository to get into their repository. If they have any new commits in theirs that you don't have in yours, you might want to get those into yours. To do this, you'll connect your Git to their Git, and have them exchange commits.
They will do this exchanging by hash IDs, because the hash IDs are truly universal. Their Git either does have some hash ID, or it doesn't. Your Git either does or doesn't have some hash ID. Any hash ID they have that you don't is some object that you can get from them, after which you both have it. Any ID you have that they don't is something you can give them, after which you both have it.
Git is very much built around this idea of adding to repositories. It is very easy to add their stuff to your Git, and to add your stuff to their Git. To get their stuff into your Git, you run git fetch (or git pull, which starts by running git fetch). To get your stuff into their Git, you run git push.
Sometimes, you don't really want to add stuff at all. Sometimes you want to get rid of some commits! You can do that—but Git isn't built for that, so it's harder. We'll see this in a bit.
With all this in mind, let's look at the anatomy of a rebase
The thing to know about git rebase is that it copies commits. That is, it takes some existing commit, extracts it, makes some change(s), and makes a new commit from that. This new commit is just that, a new commit, and making it leaves the old commit completely unaffected.
Git has to do this, because the hash ID of a commit—or any Git object, really—is just a cryptographic checksum of the contents of that commit. If you take a commit out and change it and make a new commit, you get a new, different commit. Because Git is built to add things, this just adds a new object to your repository. If you had:
...--F--G--H--I <-- master
and you made a new slightly different copy of I whose parent is existing H, you get:
...--F--G--H--I
            \
             I'
where I' is the copy. None of the existing commits have been touched at all.
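The copy-with-the-same-parent picture can be reproduced in a scratch repository. This sketch uses git commit --amend as a stand-in for the single-commit copy that a rebase would make (that substitution is my choice for brevity, not something the discussion above depends on); the directory, identity, and messages are invented for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master on any Git version
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo one > file.txt
git add file.txt
git commit -q -m "H"
echo two >> file.txt
git add file.txt
git commit -q -m "I"

h=$(git rev-parse master^)                 # commit H
i_old=$(git rev-parse master)              # commit I

# Make a slightly different copy of I; its parent is still H.
git commit -q --amend -m "I, improved"
i_new=$(git rev-parse master)              # commit I'

old_type=$(git cat-file -t "$i_old")       # the original I is still in the repository
old_parent=$(git rev-parse "$i_old^")
new_parent=$(git rev-parse "$i_new^")

echo "I  = $i_old (still a $old_type, parent $old_parent)"
echo "I' = $i_new (parent $new_parent)"
```

Both commits exist afterwards and both point back to H; the only thing that changed is which one the name master remembers.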
The tricky part is: what happens to the branch name? Well, in general, the reasons we use git rebase are either:
- to take an existing series of commits that are OK as they are, but aren't based on the right starting-point, and copy them so that they are based on the right starting-point; or
- to take an existing series of commits that aren't quite right, and improve them.
(Sometimes we do both at the same time, and there are a few other possibilities, but these are the big two reasons to rebase.)
That is, we might have:
...--F--G--H <-- master
         \
          I--J--K <-- feature
where commits I through K are just fine, but we'd like them to come after H instead of coming after G. The parents of commits are part of the frozen, unchangeable commits themselves, so to get what we would like to have, we must copy these three to new commits:
             I'-J'-K' <-- feature
            /
...--F--G--H <-- master
         \
          I--J--K [abandoned in favor of the new improved feature]
But suppose that, before we do this git rebase, we have shared the original three commits, with their unique big ugly hash IDs, with some other Git, such as the one on GitHub? We sent them those commits and told them to set their branch name feature to remember hash ID K. They did that, and they, in their Git, still have these three old commits we'd like to abandon and be rid of.
(In our Git, we probably still have a name attached to commit K. That name is origin/feature. The graph really should be:
             I'-J'-K' <-- feature
            /
...--F--G--H <-- master
         \
          I--J--K <-- origin/feature
But this isn't the critical part.)
Suppose we now have our Git call up their Git (the one at GitHub, that we're calling origin or https://github.com/... or whatever) and say: "Hey, you other Git, tell me what commits you have, and what names you use for them." They'll say: "I have master, it's commit H. And I have feature, that's commit K." So if we've actually gotten rid of K, we will now re-download K (and also I and J because we must get the whole chain).
If we run git fetch, we'll be sure to get I-J-K and update our origin/feature label so that we know that their feature names commit K, which we now / again / still share. But git pull doesn't just run git fetch.
git pull runs git merge (by default anyway)
The second step of a git pull is, by default, to run git merge. The git merge command takes some arguments to tell it what to merge, and git pull provides them, based on what git fetch just obtained. In this particular case (if the illustration above matches your case), the arguments to git merge would be:
- merge in commit K;
- set the message in the resulting merge to merge branch 'xxx-2222' of https://github.com/aaa/my_repo_name into xxx-2222
This part of git pull, the "please merge commit K now" part, happens because after the git fetch, your Git has, in your repository, these commits:
             I'-J'-K' <-- xxx-2222
            /
...--F--G--H <-- master
         \
          I--J--K <-- origin/xxx-2222
So your Git dutifully finds the merge base of commits K and K' (which is commit G) and does all the work to perform and commit a merge, giving you:
             I'-J'-K'-------M <-- xxx-2222
            /              /
...--F--G--H <-- master   /
         \       ________/
          I--J--K <-- origin/xxx-2222
You'll now see what look like two copies of commits I, J, and K, because K' really is a copy of K, and J' really is a copy of J, and so on.
You have, in essence, told Git: "Yes, I like my new and improved commits ... and I like my old ones too, so make me a merge that ties both sets together and make my branch name xxx-2222 point to new merge commit M."
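The whole sequence can be replayed in a scratch setup, which makes the doubled commits easy to see. This is only a sketch under assumptions: the temporary directories, the demo identity, and the one-file-per-commit contents are invented, and it runs git fetch and git merge as two explicit steps (the default behavior of git pull) so that no local pull configuration can change the outcome.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master    # name the first branch master on any Git version
echo base > base.txt
git add base.txt
git commit -q -m "G"

# Commits I, J, K on the feature branch, each adding its own file.
git checkout -q -b xxx-2222
for c in I J K; do
    echo "$c" > "$c.txt"
    git add "$c.txt"
    git commit -q -m "$c"
done

# Commit H on master.
git checkout -q master
echo more >> base.txt
git commit -q -am "H"

# Share everything with an "origin", then record its names as origin/*.
git clone -q --bare . "$work/origin.git"
git remote add origin "$work/origin.git"
git fetch -q origin

# Rebase: copy I-J-K to I'-J'-K' on top of H.
git checkout -q xxx-2222
git rebase -q master

# The two steps that a default git pull would run.
git fetch -q origin
base=$(git merge-base xxx-2222 origin/xxx-2222)     # commit G
git merge -q --no-edit origin/xxx-2222              # creates merge commit M

copies_of_I=$(git log --format=%s master..xxx-2222 | grep -c '^I$')
total=$(git rev-list --count master..xxx-2222)
git log --oneline --graph master..xxx-2222
```

The graph output lists I, J, and K twice each plus the merge on top, which is exactly the doubled history described above.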
What you probably wanted was to run git push --force-with-lease
Instead of running git pull (or its two components, git fetch and git merge), what you probably wanted to do when you had:
             I'-J'-K' <-- xxx-2222
            /
...--F--G--H <-- master
         \
          I--J--K <-- origin/xxx-2222
was to have your Git call up their (origin's / GitHub's) Git and offer them your new commits I', J', and K'. These are the replacements you made with git rebase, that improve the original I-J-K sequence in some way. Then you'd like your Git to tell that other Git: "And now, I think your xxx-2222 remembers K. If so, I command you to make your name xxx-2222 remember commit K'!"
If you have your Git end this git push operation with: "I now politely request that you, GitHub-Git, set your name xxx-2222 so that it remembers commit K'", they will say: "No, I won't do that, because if I do that, I'll abandon my I-J-K commits." But of course, that's exactly what you want them to do.
The risk here is that they might now have I-J-K-L in their repository. That is, their xxx-2222 might remember some new commit L that remembers commit K and so on. You can handle that risk by using this --force-with-lease option. That uses your origin/xxx-2222 (your Git's memory of their Git's xxx-2222) to say "I think your xxx-2222 is ...".
You can use git push --force, which drops the "I think ... if so ..." part of the command. That's the more dangerous, but even-more-forceful, kind of git push that is likely to make them go ahead and obey your command.
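Here is a hedged sketch of the polite push being refused and the --force-with-lease push being accepted, again in a scratch setup. The temporary directories, the demo identity, and the file contents are made up, and the local bare repository only stands in for GitHub; a real hosting service may add its own rules, such as protected branches.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master    # name the first branch master on any Git version
echo base > base.txt
git add base.txt
git commit -q -m "G"
git checkout -q -b xxx-2222
for c in I J K; do
    echo "$c" > "$c.txt"
    git add "$c.txt"
    git commit -q -m "$c"
done
git checkout -q master
echo more >> base.txt
git commit -q -am "H"

# Stand-in for GitHub: a bare repository that already has I-J-K.
git clone -q --bare . "$work/origin.git"
git remote add origin "$work/origin.git"
git fetch -q origin                        # sets origin/xxx-2222 to K
old_remote_tip=$(git rev-parse origin/xxx-2222)

# Replace I-J-K with I'-J'-K'.
git checkout -q xxx-2222
git rebase -q master

# Polite request: refused, because accepting it would abandon I-J-K.
if git push -q origin xxx-2222 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# "I think your xxx-2222 is K; if so, make it K'."
git push -q --force-with-lease origin xxx-2222
remote_tip=$(git --git-dir="$work/origin.git" rev-parse xxx-2222)

echo "plain push: $plain_push"
echo "their xxx-2222 is now $remote_tip"
```

If someone else had pushed a commit L in the meantime, the remote tip would no longer match origin/xxx-2222 and the --force-with-lease push would be refused too, which is the safety margin that plain --force gives up.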
Conclusion
It's always important to keep in mind several things:
- What does the commit graph look like?
- Who (which Git) is remembering which commit hash IDs, under which names?
The git push command sends commits from your Git to another Git, and then asks or commands them to set some of their names to remember some commit hash ID (one hash ID per name). The git fetch command obtains commits into your Git from another Git, and then sets your origin/* or other remote-tracking names based on what your Git saw from their Git.
These are not completely symmetric! With git fetch, your remote-tracking names get updated, but that has no effect on your branch names. With git push, their branch names get updated, or they reject your polite request because the update would lose some commit(s).
Since git rebase copies commits to new-and-improved ones, and then has your Git abandon the old and not-so-great ones in favor of the copies, you'll need to forcefully tell their Git to do the same: abandon some old not-so-great commits in favor of new-and-improved ones.
Note that when you do this (when you use git push --force or git push --force-with-lease), you're telling one Git repository to lose some commits. What if those commits have spread into more Git repositories? Everyone who has run git fetch against the GitHub repository has picked up all the new commits they can get from that repository. Your old and not-so-great commits may be spread far and wide now, and difficult to be rid of. Make sure everyone who might use them understands that you intended to revoke and replace them!