I am not sure which git command I ran, but when I run git log I see the following message:

Merge branch 'xxx-2222' of https://github.com/aaa/my_repo_name into xxx-2222

I think I did the following:

  1. I had a feature branch (xxx-1111), and I created a new feature branch out of it (xxx-2222).
  2. I then rebased my xxx-1111 branch onto my development branch, squashed all the commits, and merged it into the development branch.
  3. I started working on my new branch xxx-2222.
  4. I rebased xxx-2222 onto the development branch and also ran git pull.

I think that is when the message above started appearing in the log.

Could someone tell me what this means and why it happened? How can I trace it? And finally, do I need to do anything to fix it before I merge into develop?

Gaurang Shah
  • Not sure what exact reason behind the strange commit you are seeing but rebasing branches is not a good idea. It is not recommended too. If you are creating a new branch out of another branch, all you need to do is branch merge to keep them in sync. – Chetan Jun 27 '19 at 01:11
  • @ChetanRanpariya - what is your source for stating that it is "not recommended" to rebase branches in git? I tend to think of rebasing as a powerful tool for maintaining a clean and readable git history, and have used it frequently without any issues. – Alexander Nied Jun 27 '19 at 01:13
  • https://www.atlassian.com/git/tutorials/merging-vs-rebasing – Chetan Jun 27 '19 at 01:14
  • https://stackoverflow.com/questions/39154794/detailed-reason-why-remote-git-rebase-is-so-evil – Chetan Jun 27 '19 at 01:15
  • @Chetan It's not true that rebasing is not a good idea. Actually it's a *great* idea.... the general rule is not to rebase a branch _that has been published already_ and that other people might have already started working on... then it is not so good. But consider a feature branch that you are working on.... if the branch is yours and until it is merged into master (or any other public branch, for that matter), you are free to rebase it as many times as you like, even if it has been published. – eftshift0 Jun 27 '19 at 01:16
  • @GuarangShah Without seeing your repo, my hunch is that you ran into issues here: _4. I rebase xxx-2222 from development branch and did git pull as well._ If you rebased the branch and then pulled from origin you essentially changed the local history then tried to merge back the old history. Just a hunch. Generally after a rebase you would not `pull` but instead force `push` in order to overwrite the old branch with the "new" branch (and if you want to be safe, you can place a tag before you rebase so that you can recover the original state). – Alexander Nied Jun 27 '19 at 01:22
  • `xxx-2222 of https://github.com/aaa/my_repo_name` and `xxx-2222` are not the same branch. The former is on the remote repository and the latter on the local one. Such commits can be avoided by `git pull origin -r xxx-2222`. With `-r`, it tries to rebase onto the remote branch instead of merging it. – ElpieKay Jun 27 '19 at 03:21

1 Answer

TL;DR

You have run git pull. The git pull command means:

  1. Run git fetch for me.
  2. Run a second Git command for me, as soon as the fetch finishes.

That second command defaults to git merge, and it is this git merge that is causing your problem. Read through the long discussion below to see why.

I advise new Git users to avoid git pull. Run git fetch yourself. Then, if that's appropriate, run the second Git command—git merge, or git rebase, or whatever second command you might want to use if any—yourself. This gives you a chance to stop and look at what git fetch fetched, before you leap into running a second command that might not be appropriate after all.
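As a rough illustration of that fetch-then-look-then-decide workflow, here is a minimal sketch. Everything in it is an assumption made for the demo: the scratch directory, the repository names origin.git, mine and other, the file name, and the demo identity. A local bare repository stands in for GitHub.

```shell
set -e
# Throwaway identity and scratch directory (hypothetical, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q --bare origin.git                 # stands in for the GitHub repository
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master       # make sure the branch is named master
git remote add origin "$work/origin.git"
echo one > file.txt
git add file.txt
git commit -q -m "first"
git push -q origin master

# Someone else adds a commit to origin (simulated with a second clone).
git clone -q -b master "$work/origin.git" "$work/other"
(
    cd "$work/other"
    echo two >> file.txt
    git add file.txt
    git commit -q -m "second"
    git push -q origin master
)

# Step 1 of git pull, run by hand.
before=$(git rev-parse master)
git fetch -q origin
after_fetch=$(git rev-parse master)           # fetch alone does not move master

# Stop and look at what arrived before choosing a second command.
incoming=$(git log --oneline master..origin/master | wc -l | tr -d ' ')
git log --oneline --graph master origin/master

# Step 2, chosen deliberately: here a fast-forward is all that is needed.
git merge -q --ff-only origin/master
```

The point is the gap between the fetch and the merge: that is where you get to decide whether merge, rebase, or nothing at all is the right second command.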

Long

First, that's not the same branch. It's a different branch with the same name.

Analogies are terrible ways of reasoning, but sometimes they are useful. Imagine you're at a party and everyone there is named Bob. They all have the same name. Does that mean they are all the same person? Of course not—but they are all "Bob". You're just going to have to use some other name to distinguish them.

That's what Git is attempting here:

Merge branch ... of ... into xxx-2222

The two blanks here get filled in with:

  • what they call their branch, and
  • the name you use to talk to them

so that you can later realize: Oh, that wasn't my xxx-2222, that was their xxx-2222. That's a little like realizing that you weren't talking to Bob Jones, but rather to Bob Smith.

So, let's talk a bit about naming things in Git.

Commits have unique (but ugly) names; humans use branch names

There is exactly one name you can count on every time, in any and every Git repository. That name is a hash ID. Hash IDs are the big ugly strings of hexadecimal digits that git log prints, such as 8dca754b1e874719a732bc9ab7b0e14b21b1bc10. These IDs are unique and never repeated, so that you either do have commit 8dca754b1e874719a732bc9ab7b0e14b21b1bc10 (which is a commit in the Git repository for Git), or you don't (presumably because you've never mingled your Git repository with one for Git itself: and unless you're going to write some code to modify Git, why would you?).

Every commit stores a snapshot of all of your files, plus some metadata. The metadata includes who made the commit—name, email address, and date-and-time-stamp—but also the raw hash ID of the commit that comes before the new commit. This means that every commit remembers its immediate predecessor or parent, by the raw hash ID.

What that means is that we can draw pictures of commits using backwards-pointing arrows, like this:

... <-F <-G <-H

Here, H stands in for the hash ID of the last commit we've made. It remembers the hash ID of the previous commit G. G in turn remembers the hash ID of commit F, and so on.

To use commit H, we'd have to memorize its big ugly hash ID. We don't need to remember G's any more, because that's in H. We don't need to remember F's, because we can use H to find G, and G has F's hash ID. This pattern goes on and on: all we need to do is remember the hash ID of the last commit.

But why should we remember it, when we have a computer? We can have the computer remember the hash ID. For instance, let's have the name master remember the hash ID of commit H, like this:

...--F--G--H   <-- master

Now when we go to make a new commit, we'll start with git checkout master—which gets us commit H to work on, and remembers that we're "on branch master", as git status would say—and we'll do some work, git add, and git commit. Git will save a new snapshot of all of our files, and come up with some new and unique big ugly hash ID, which we'll just call I. New commit I will remember the hash ID of commit H:

...--F--G--H
            \
             I

and then, as the last step of making the commit, Git will store the new hash ID into our name master, so that master points to I instead of H:

...--F--G--H--I   <-- master

I remembers H, which remembers G, and so on, so it's OK to have our master remember only the new hash ID for I.
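You can watch this happen in a scratch repository. The following is only a sketch: the temporary directory, the file name, the demo identity, and the commit messages H and I are made up to match the drawing above.

```shell
set -e
# Throwaway identity and scratch repository (hypothetical, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master

echo one > file.txt
git add file.txt
git commit -q -m "H"
h=$(git rev-parse master)                 # the name master remembers H

echo two >> file.txt
git add file.txt
git commit -q -m "I"
i=$(git rev-parse master)                 # master now remembers I instead
parent_of_i=$(git rev-parse 'master^')    # the parent hash ID stored inside I

echo "H = $h"
echo "I = $i"
# The first two lines of the raw commit are its "tree" and "parent" lines.
git cat-file -p "$i" | sed -n '1,2p'
```

The parent line printed at the end should carry the same hash ID as H, which is the backwards-pointing arrow in the drawing.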

Git is distributed, which means there are lots of separate repositories

You have two different Git repositories here. One is the Git repository on your own local machine, where you run git checkout and git add and git commit and so on. That repository is truly yours: you have 100% total control over it.

The other is over on GitHub. Technically that one is their (GitHub's) repository. They have handed most of the control of it over to you, so in most useful ways it's yours too, but it's more convenient—plus more technically accurate—to call it their repository, so let's do that here.

Repositories can share

These two repositories need not have the same commits, but in general, you probably want any new commits you make in your repository to get into their repository. If they have any new commits in theirs that you don't have in yours, you might want to get those into yours. To do this, you'll connect your Git to their Git, and have them exchange commits.

They will do this exchanging by hash IDs, because the hash IDs are truly universal. Their Git either does have some hash ID, or it doesn't. Your Git either does or doesn't have some hash ID. Any hash ID they have that you don't is some object that you can get from them, after which you both have it. Any ID you have that they don't is something you can give them, after which you both have it.

Git is very much built around this idea of adding to repositories. It is very easy to add their stuff to your Git, and to add your stuff to their Git. To get their stuff into your Git, you run git fetch (or git pull, which starts by running git fetch). To get your stuff into their Git, you run git push.

Sometimes, you don't really want to add stuff at all. Sometimes you want to get rid of some commits! You can do that—but Git isn't built for that, so it's harder. We'll see this in a bit.

With all this in mind, let's look at the anatomy of a rebase

The thing to know about git rebase is that it copies commits. That is, it takes some existing commit, extracts it, makes some change(s), and makes a new commit from that. This new commit is just that—a new commit—that leaves the old commit completely unaffected.

Git has to do this, because the hash ID of a commit—or any Git object, really—is just a cryptographic checksum of the contents of that commit. If you take a commit out and change it and make a new commit, you get a new, different commit. Because Git is built to add things, this just adds a new object to your repository. If you had:

...--F--G--H--I   <-- master

and you made a new slightly different copy of I whose parent is existing H, you get:

...--F--G--H--I
            \
             I'

where I' is the copy. None of the existing commits have been touched at all.

The tricky part is: what happens to the branch name? Well, in general, the reasons we use git rebase are either:

  • to take an existing series of commits that are OK as they are, but aren't based on the right starting-point, and copy them so that they are based on the right starting-point; or
  • to take an existing series of commits that aren't quite right, and improve them.

(Sometimes we do both at the same time, and there are a few other possibilities, but these are the big two reasons to rebase.)

That is, we might have:

...--F--G--H   <-- master
         \
          I--J--K   <-- feature

where commits I through K are just fine, but we'd like them to come after H instead of coming after G. The parents of commits are part of the frozen, unchangeable commits themselves, so to get what we would like to have, we must copy these three to new commits:

             I'-J'-K'   <-- feature
            /
...--F--G--H   <-- master
         \
          I--J--K   [abandoned in favor of the new improved feature]
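To see the copying for yourself, here is a small sketch that builds the graph above in a scratch repository and rebases feature. The commit helper, the one-file-per-commit layout, and the demo identity are assumptions made for the demo, not anything from your repository.

```shell
set -e
# Throwaway identity and scratch repository (hypothetical, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Helper (made up for this demo): one new file per commit, named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit F
commit G
git checkout -q -b feature                # feature starts at G
commit I
commit J
commit K
old_k=$(git rev-parse feature)            # hash ID of the original K

git checkout -q master
commit H                                  # master moves on to H
git checkout -q feature
git rebase -q master                      # copy I-J-K so that they come after H
new_k=$(git rev-parse feature)            # hash ID of the copy K'

echo "K  = $old_k"
echo "K' = $new_k"
old_type=$(git cat-file -t "$old_k")      # the original K is still in the repository
base=$(git merge-base feature master)     # the copies now sit on top of H
```

The two hash IDs differ, and the old one still resolves to a commit: nothing was changed in place, a second chain was added.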

But suppose that, before we do this git rebase, we have shared the original three commits—with their unique big ugly hash IDs—with some other Git, such as the one on GitHub? We sent them those commits and told them to set their branch name feature to remember hash ID K. They did that, and they, in their Git, still have these three old commits we'd like to abandon and be rid of.

(In our Git, we probably still have a name attached to commit K. That name is origin/feature. The graph really should be:

             I'-J'-K'   <-- feature
            /
...--F--G--H   <-- master
         \
          I--J--K   <-- origin/feature

But this isn't the critical part.)

Suppose we now have our Git call up their Git—the one at GitHub, that we're calling origin or https://github.com/... or whatever—and say: Hey, you other Git, tell me what commits you have, and what names you use for them. They'll say: I have master, it's commit H. And I have feature, that's commit K. So if we've actually gotten rid of K, we will now re-download K (and also I and J because we must get the whole chain).

If we run git fetch, we'll be sure to get I-J-K and update our origin/feature label so that we know that their feature names commit K, which we now / again / still share. But git pull doesn't just run git fetch.

git pull runs git merge (by default anyway)

The second step of a git pull is, by default, to run git merge. The git merge command takes some arguments to tell it what to merge, and git fetch provides them. In this particular case—if the illustration above matches your case—the arguments to git merge would be:

  • merge in commit K;
  • set the message in the resulting merge to Merge branch 'xxx-2222' of https://github.com/aaa/my_repo_name into xxx-2222.

This part of git pull (the "please merge commit K now" part) happens because, after the git fetch, your Git has these commits in your repository:

             I'-J'-K'   <-- xxx-2222
            /
...--F--G--H   <-- master
         \
          I--J--K   <-- origin/xxx-2222

So your Git dutifully finds the merge base of commits K and K' (which is commit G) and does all the work to perform and commit a merge, giving you:

             I'-J'-K'-------M   <-- xxx-2222
            /              /
...--F--G--H   <-- master /
         \       ________/
          I--J--K   <-- origin/xxx-2222

You'll now see what look like two copies of commits I, J, and K—because K' really is a copy of K, and J' really is a copy of J, and so on.

You have, in essence, told Git: Yes, I like my new and improved commits ... and I like my old ones too, so make me a merge that ties both sets together and make my branch name xxx-2222 point to new merge commit M.
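If you want to trace this yourself, the following sketch tries to reproduce the situation in scratch repositories, assuming my guess about your sequence of commands is right. The names origin.git and mine, the commit helper, and the demo identity are invented for the demo, and a local bare repository stands in for GitHub. I pass --no-rebase explicitly because newer Git versions refuse to pick a strategy for diverged branches on their own.

```shell
set -e
# Throwaway identity and scratch directory (hypothetical, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q --bare origin.git             # stands in for the GitHub repository
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Helper (made up for this demo): one new file per commit, named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit F
commit G
git checkout -q -b xxx-2222
commit I
commit J
commit K
git push -q origin xxx-2222               # their xxx-2222 now remembers K

git checkout -q master
commit H
git checkout -q xxx-2222
git rebase -q master                      # our xxx-2222 now remembers K'

# The step that causes the surprise: fetch, then merge their K into our K'.
git pull -q --no-rebase --no-edit origin xxx-2222

subject=$(git log -1 --format=%s)
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "$subject"
git log --oneline --graph                 # I, J, K appear twice, joined by the merge
```

In your real repository, git log --oneline --graph on xxx-2222 should show the same shape if this is what happened: a merge commit with two parents sitting on top of two parallel chains with the same commit messages.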

What you probably wanted was to run git push --force-with-lease

Instead of running git pull—or its two components, git fetch and git merge—what you probably wanted to do when you had:

             I'-J'-K'   <-- xxx-2222
            /
...--F--G--H   <-- master
         \
          I--J--K   <-- origin/xxx-2222

was to have your Git call up their (origin's / GitHub's) Git and offer them your new commits I', J', and K'. These are the replacements you made with git rebase, that improve the original I-J-K sequence in some way. Then you'd like your Git to tell that other Git: And now, I think your xxx-2222 remembers K. If so, I command you to make your name xxx-2222 remember commit K'!

If you have your Git end this git push operation with: I now politely request that you, GitHub-Git, set your name xxx-2222 so that it remembers commit K', they will say: No, I won't do that, because if I do that, I'll abandon my I-J-K commits. But of course, that's exactly what you want them to do.

The risk here is that they might now have I-J-K-L in their repository. That is, their xxx-2222 might remember some new commit L that remembers commit K and so on. You can handle that risk by using the --force-with-lease option. It uses your origin/xxx-2222 (your Git's memory of their Git's xxx-2222) to fill in the I think your xxx-2222 is ... part of the command.

You can also use git push --force, which drops the I think ... if so ... part of the command. That's the more dangerous, but even-more-forceful, kind of git push that is likely to make them go ahead and obey your command.
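Here is a sketch of the polite request being refused and the leased command being obeyed. As before, the scratch repositories, the commit helper, and the demo identity are made up for the demo, and a local bare repository plays the part of GitHub.

```shell
set -e
# Throwaway identity and scratch directory (hypothetical, demo only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q --bare origin.git             # stands in for the GitHub repository
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Helper (made up for this demo): one new file per commit, named after it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit F
commit G
git checkout -q -b xxx-2222
commit I
commit J
commit K
git push -q origin xxx-2222               # their xxx-2222 remembers K
git checkout -q master
commit H
git checkout -q xxx-2222
git rebase -q master                      # our xxx-2222 remembers K'

# Polite request: refused, because obeying would abandon their I-J-K.
if git push -q origin xxx-2222 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# Command with a safety check: "I think your xxx-2222 is K; if so, take K'."
git push -q --force-with-lease origin xxx-2222

ours=$(git rev-parse xxx-2222)
theirs=$(git --git-dir="$work/origin.git" rev-parse refs/heads/xxx-2222)
echo "plain push: $plain_push"
echo "ours=$ours theirs=$theirs"
```

If someone had added a commit L on their side in the meantime, your origin/xxx-2222 would no longer match their xxx-2222 and the leased push would be refused too, which is the whole point of the lease.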

Conclusion

It's always important to keep in mind several things:

  • What does the commit graph look like?
  • Who (which Git) is remembering which commit hash IDs, under which names?

The git push command sends commits from your Git to another Git, and then asks or commands them to set some of their names to remember some commit hash ID (one hash ID per name). The git fetch command obtains commits into your Git from another Git, and then sets your origin/* or other remote-tracking names based on what your Git saw from their Git.

These are not completely symmetric! With git fetch, your remote-tracking names get updated, but that has no effect on your branch names. With git push, their branch names get updated—or they reject your polite request because the update would lose some commit(s).

Since git rebase copies commits to new-and-improved ones, and has your Git abandon the old and not-so-great ones in favor of the copies, you'll need to forcefully tell their Git to do the same: abandon some old not-so-great commits in favor of new-and-improved ones.

Note that when you do this—when you use git push --force or git push --force-with-lease—you're telling one Git repository to lose some commits. What if those commits have spread into more Git repositories? Everyone who has run git fetch to the GitHub repository has picked up all the new commits they can get from that repository. Your old and not-so-great commits may be spread far and wide now, and difficult to be rid of. Make sure everyone who might use them understands that you intended to revoke and replace them!

torek