
I was trying to raise a pull request to merge a feature branch into an integration branch, but the two branches had conflicts. To resolve them, I ran:

git checkout integration
git rebase feature
# resolve conflicts
git rebase --continue

Now I am stuck with:

Your branch and 'origin/integration' have diverged,
and have 7 and 2 different commits each, respectively.

I want my feature branch to be merged into integration by using rebase. Is that possible? Or is the only solution the following?

git checkout feature
git merge integration
# resolve conflicts, then create the merge commit
git push
# raise pull request

Most answers I see reset the branch to its head, but that doesn't solve my problem; see answer to reset.

I know rebase is better than merge, but how do I go about doing that here?

Lucas
  • I tried that, but now I am stuck with `Your branch and 'origin/feature' have diverged, and have 7 and 6 different commits each, respectively.` @TTT – Lucas Feb 17 '21 at 04:10
  • It's perfectly normal for your branch to diverge after a rebase. I just changed my comment to an answer and also explain how to modify your push command. – TTT Feb 17 '21 at 05:51

3 Answers


It sounds like you would prefer rebase, and therefore want the first set of commands, but you have the integration and feature branches swapped. You need to check out feature instead, because that's the branch you're modifying, and then run git rebase integration or, usually better, git rebase origin/integration.

Here are the commands you would run:

git fetch
git checkout feature
git rebase origin/integration
# resolve any conflicts
git rebase --continue
git push --force-with-lease

Note the push command has the option --force-with-lease on it. That is a "force push" with an extra check attached to it. This is how you push out your branch if you have previously pushed, and you have modified any of the commits on your branch after the initial push. When you do a rebase, you are re-writing those commits, and changing their commit ID (hash), which is why you see the message that your branch has diverged. This is completely normal after doing a rebase. It also happens when you amend a commit. As for force push, the basic command is:

git push --force

That means: Blow away my branch on the remote and replace it with my local branch. IMHO you rarely want to use --force. Instead I would usually prefer this option:

git push --force-with-lease

That means: Blow away my branch on the remote and replace it with my local branch, but only if there aren't any commits on my remote branch that I haven't fetched yet. (In case you are sharing your branch with someone else and they pushed out new commits that you didn't know about yet. Or, maybe you pushed out some commits to your branch from another repo and forgot about it!)
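The divergence and the lease-protected push can be reproduced in a throwaway sandbox. The following is only a sketch: the temporary directories, file names, and the "demo" identity are made up for illustration, and the second commit on integration stands in for work pushed by someone else.

```shell
#!/bin/sh
set -eu
# Hypothetical identity so the commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/remote.git"      # plays the role of the server
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/integration
git remote add origin "$work/remote.git"

# One shared commit, then a feature branch that gets pushed.
echo base > base.txt; git add base.txt; git commit -q -m "base"
git push -q origin integration
git checkout -q -b feature
echo feat > feature.txt; git add feature.txt; git commit -q -m "feature work"
git push -q -u origin feature

# integration moves on after feature was pushed.
git checkout -q integration
echo more > more.txt; git add more.txt; git commit -q -m "integration work"
git push -q origin integration

# Rebase feature onto the updated integration branch.
git fetch -q origin
git checkout -q feature
git rebase -q origin/integration

# Commits only on local feature (left) vs. only on origin/feature (right).
diverged=$(git rev-list --left-right --count feature...origin/feature | tr '\t' ' ')
echo "ahead/behind after rebase: $diverged"

# Replace the remote branch, but only if origin/feature is still what we last saw.
git push -q --force-with-lease origin feature
after_push=$(git rev-list --left-right --count feature...origin/feature | tr '\t' ' ')
echo "ahead/behind after push: $after_push"
```

The two counts printed after the rebase correspond to the numbers in the "have diverged" message; once the lease-protected push succeeds, both drop to zero.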

There is a relatively new option as well:

git push --force-if-includes

More info on that here.

TTT

I know rebase is better than merge ...

Why do you know this? Is it "better"? What does "better" actually mean?

now I am stuck with

Your branch and 'origin/integration' have diverged,
and have 7 and 2 different commits each, respectively.

This is exactly the sort of thing git rebase does. If that's what you want, then git rebase may well be better. If that's not what you want, git rebase may well be worse.

As TTT noted in a comment (then converted to an answer), it seems likely that you would want to rebase the feature onto the last commit on integration or origin/integration. That, too, will cause a divergence (of feature vs origin/feature), if you earlier did a git push origin feature—because that's the kind of thing git rebase does.

In the end, though, you should make your own decision about what's better for your purposes and your work-flows. This may change over time. Know what git merge does, and know what git rebase does, and choose the one that suits whatever purpose you have.

Long: Background, or, what to know before you merge or rebase

The main thing to know about Git is that a repository is, in essence, all about commits. It's not about files—although commits contain files—and it's not about branches, though branch names help you (and Git) find commits. It's really all about the commits themselves. This means you need to know what a commit is and does for you.

Before we look at what the commit is, though, let's look at the lowest level way in which Git finds a commit. That way is by its hash ID. The hash ID of a commit is the commit's number: Every commit, in Git, is numbered.

These numbers are not simple counting numbers, though. We don't have commit #1 followed by #2, then #3, and so on. Instead, each number is very large and expressed as a big ugly string in hexadecimal, like 328c10930387d301560f7cbcd3351cc485a13381 for instance. These numbers look random, but really aren't: that particular number (the one starting with 328c1093) is now reserved to that one particular commit in the Git repository for Git, and will never be used for any other commit, ever.[1]

So the hash ID of any commit is totally unique to that one commit. It's formed by computing a cryptographic hash over all the bits in the commit,[2] so that every Git will compute the same ID for the commit. As a consequence, no part of any commit can ever be changed. Git has a simple key-value database that stores all of its objects, commit or otherwise, by hash ID; Git finds an object by looking up its hash ID in the database. If you copy one of these objects out of the database, change any of the bits, and put the result back, what you get is not a changed object: instead, the old object remains, and the new object has a new and different hash ID.

We'll see why this matters when we get to git rebase. For now, just remember: no commit can ever change. Once made, it's stuck the way it is.
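A small sketch of this content-addressing, using blob objects rather than commits because they are the simplest object type to create by hand. The temporary repository and the sample strings are made up for illustration.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# The same bytes always produce the same ID (given the same hash algorithm).
id1=$(printf 'hello\n' | git hash-object -w --stdin)
id2=$(printf 'hello\n' | git hash-object -w --stdin)

# Changing a single byte yields a new object with a different ID;
# the first object stays in the database, untouched.
id3=$(printf 'hellp\n' | git hash-object -w --stdin)

echo "same content:    $id1 $id2"
echo "changed content: $id3"

# Looking up the first ID still returns the original bytes.
original=$(git cat-file -p "$id1")
echo "stored under id1: $original"
```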


[1] Technically, your own Git could re-use this number for one of your commits, but only if you never hook your Git repository up to a clone of the Git repository for Git. In practice, that hash ID will never be re-used anywhere else. In a sense, it was reserved for that commit even before that commit ever happened. Making this work requires extremely long hash IDs, and in some ways, the 160-bit SHA-1 IDs that Git uses today are now too short. Git is moving towards using 256-bit SHA-256 hash IDs in the future, which should give at least several more decades of breathing room: SHA-1 was fine in 2000, and even OK in 2010, but now in 2021 it's starting to look a bit cramped.

[2] Technically, it's the commit minus the GPG signature, if any, since the hash ID has to be computed separately, before the GPG signature is known. So there's a small loophole for GPG-signed commits.


What's in a commit

Each commit consists of two parts, both of which are required:

  • Every commit contains a full snapshot of every file, as of the state (and/or file contents depending on how you think of files) that that file had at the time you, or whoever, made the commit. More precisely, it has a snapshot of all the files known to Git (the so-called tracked files).

    The files inside the commit are in a special, read-only, Git-only format. They're compressed and de-duplicated. The de-duplication takes care of the fact that when you make a new commit, you usually haven't changed most of the files. Git won't really have to save a whole new copy because it can just refer to some existing saved copy. In fact, these files are stored as something other than files: their content is stored as Git objects (using that same hashing system) in the object database, and their names are stored via yet more Git objects. So they're not quite files in the sense that your OS has files. But they can be extracted to files, later.

  • Meanwhile, every commit contains some metadata, or information about the commit itself. This includes the name of the person who made it, and their email address, and the date-and-time-stamps that go with these. The metadata include a log message, which git log will show (or show the subject-line from, when using --oneline). They can include various other things too, not all of which you'll see directly. The main thing that they include for Git is this: Git stores, in each commit metadata, the hash ID(s) of some other, earlier commit(s).

These earlier commits—or commit, singular; there's usually just one—are the parents of the commit. Since most commits have just the one parent, what we get, if we draw the commits, looks like this:

... <-F <-G <-H

Here, each uppercase letter stands in for the real hash ID. The last commit—written on the right side of the line—is commit H here (with H standing for hash ID). By looking up hash ID H, Git will find the commit itself, with both the metadata and the snapshot.

The snapshot lets Git get you all the files back, as of the form they had at the time you, or whoever, made commit H, at any time in the future. That is in fact what git checkout or git switch does: it extracts all the files from some commit, so that you can see and use them.

The metadata lets Git tell you who made the commit, but it also lets Git step back to the previous commit. In our case we've drawn the previous commit as G. The arrow coming out of H points to G, though in reality, commit H just contains, in its metadata, the hash ID of earlier commit G. (We called it G here because the letter G is one before the letter H. In reality we have random-looking hash IDs, with no obvious connection between them.)
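To see the snapshot reference and the parent link directly, the raw commit object can be printed with git cat-file. This is a minimal sketch in a scratch repository; the file name, commit messages, and "demo" identity are placeholders.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

echo one > file.txt; git add file.txt
git commit -q -m "first (plays the role of G)"
g=$(git rev-parse HEAD)

echo two > file.txt
git commit -q -am "second (plays the role of H)"

# Raw commit object: tree (the snapshot), parent, author, committer, message.
git cat-file -p HEAD

# The parent line of H holds the full hash ID of G.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "parent of H: $parent"
```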

Note that commit G, too, has a snapshot. If Git extracts both snapshots and compares them, Git can use that to tell you what, if anything, is different in H. That's how Git can show you changes. Git didn't store changes—it stored two snapshots, here—but by putting the two snapshots side by side and playing a game of Spot the Difference, Git can tell you what changed.

But commit G has metadata too, and in that metadata, Git has saved the hash ID of a still-earlier commit, which we're calling F. So Git can move from H to G, and now compare F-vs-G to show you what changed in G.

Having moved back one step to G, Git can now move back one more step to F. F points back to some still-earlier commit, so Git can show you what changed, and then move back another step, and so on. This keeps going, with git log just going on and on, until eventually Git arrives at the first-ever commit. That first-ever commit doesn't list a parent commit hash ID, and that's how git log knows to stop going backwards.
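The backwards walk and its stopping point can be observed with git rev-list, the plumbing command that does the same kind of traversal as git log. A sketch with three placeholder commits:

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

for n in 1 2 3; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

# Walk backwards from HEAD, printing each commit followed by its parent(s).
git rev-list --parents HEAD

# The walk ends at the commit that lists no parent at all.
root=$(git rev-list --max-parents=0 HEAD)
count=$(git rev-list --count HEAD)
root_fields=$(git rev-list --parents -n 1 "$root" | wc -w)
echo "root commit: $root (of $count commits)"
```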

So this is how commits work. Each one has data—a snapshot—and metadata, and this forms the commits into backwards-looking chains. By starting at the end, at the last commit—in our case, commit H—Git can work backwards. That's how Git works: backwards, from the last commit back to the first. But there is one great big hitch. Where do we get that last commit hash ID? Git needs the hash ID of some commit to start with, to work backwards from.

Branch names

We need to store the last commit hash ID somewhere. While we could write it down and then type it in again, that would be really obnoxious and painful. Besides, we have a computer. It should do the work for us: save the last hash ID in a file, for instance, and then get it back when we need it. This is what branch names do.

Let's add a branch name to our drawing. Suppose the branch name is main (the new default on GitHub):

...--F--G--H   <-- main

The branch name simply points to—or more precisely, contains the hash ID of—the last commit in the chain. Whatever hash ID is in the name, that's the "last" one. That's true even if the chain keeps going.

Let's see how this works by adding another branch name, such as feature:

...--F--G--H   <-- feature, main

At the moment, both names point to commit H. But if we make a new commit on feature, this new commit—we'll call it commit I—will point back to H, and Git has to update the name feature to point to I:

...--F--G--H   <-- main
            \
             I   <-- feature

There's one other thing to add to our drawing: How did Git know which branch name to update? To keep track of this, we'll attach the special name HEAD to one branch name. That makes the "before new commit" picture look like this:

...--F--G--H   <-- feature (HEAD), main

and the "after new commit" picture looks like this:

...--F--G--H   <-- main
            \
             I   <-- feature (HEAD)

Note how HEAD remains attached to the name feature at all times. The name moves, as we add new commits: if we add another new commit J, which will point back to I, we'll get:

...--F--G--H   <-- main
            \
             I--J   <-- feature (HEAD)

The name now points to the last commit, which points back to I, which points back to H, which points back to G, and so on.
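The way a branch name tracks the last commit, and the way HEAD stays attached to one name, can be checked with git rev-parse and git symbolic-ref. This is a sketch in a scratch repository; the branch and file names simply mirror the drawings.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/main

echo h > file.txt; git add file.txt; git commit -q -m "H"
h=$(git rev-parse main)

# Create feature: both names now hold the hash ID of H.
git checkout -q -b feature
before=$(git rev-parse feature)

# A new commit I moves only the name HEAD is attached to.
echo i > file.txt; git commit -q -am "I"
after=$(git rev-parse feature)
main_now=$(git rev-parse main)

head_ref=$(git symbolic-ref HEAD)      # which branch name HEAD is attached to
parent_of_i=$(git rev-parse feature^)  # I points back to H

echo "HEAD is attached to $head_ref"
echo "main=$main_now feature=$after"
```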

That's really pretty much it: a branch name lets Git find the last commit, to git checkout or git switch to. When you do that, Git fills in your work area with the files that are stored in that commit, and attaches HEAD to that name. Then, as you make new commits, those commits extend whatever branch HEAD is attached to. So if we go back to main we get:

...--F--G--H   <-- main (HEAD)
            \
             I--J   <-- feature

Our updated files are safely tucked away in commit J; we're back to the older versions that are safely tucked away in commit H, now. If we make a new branch—say, integration or feature2 or whatever—and check that one out, HEAD will be attached to that one, which will still select commit H:

...--F--G--H   <-- feature2 (HEAD), main
            \
             I--J   <-- feature

and we can make new commits on this one:

             K   <-- feature2 (HEAD)
            /
...--F--G--H   <-- main
            \
             I--J   <-- feature

Note also how everything we do just adds new commits here. No existing commit changes, because no existing commit can change.

Merging

With the above in mind, let's take a very brief look at what git merge does. We won't consider its many special cases, but only the most general one. This general case occurs when we have two branches that have diverged from some common starting point, as in this drawing:

             I--J   <-- branch1 (HEAD)
            /
...--F--G--H
            \
             K--L   <-- branch2

Note how commits up through and including H are on both branches, while commits I-J and K-L are each on only one of the two branches. The last commit on branch1 is J and the last commit on branch2 is L. We have branch1, and therefore commit J, checked out.

If we now run git merge branch2, Git uses the name branch2 to locate commit L. That's the commit to merge. At this point, the branch names are all irrelevant, because merging—like so much else in Git—really works based on commits.

The goal of this merge is to combine work. Git wants to help us combine the work we did on branch1 with the work someone (maybe us again, or maybe someone else—this all depends on how we got the commits) did on branch2.

Since commits hold snapshots, Git is going to have to compare some set of commits, to figure out who changed what. The only way to make this all work is to find some shared starting point. That could be commit G, because commit G is on both branches. But so is commit H. One of these two commits is "better", and that's the one that's "closer to the ends". Technically, Git uses a Lowest Common Ancestor algorithm (the LCA of a Directed Acyclic Graph) to make this choice; in this case, the commit it finds is H. Git calls this the merge base.

Git now runs the equivalent of:

git diff --find-renames <hash-of-H> <hash-of-J>

to figure out what we changed, and a second git diff:

git diff --find-renames <hash-of-H> <hash-of-L>

to figure out what they changed. Git then combines these two sets of changes. The resulting added-together changes will, if applied to the snapshot in commit H, keep our changes but add theirs too. So that's what Git does: apply the combined changes to the snapshot from the merge base.
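The merge base and the two diffs can be inspected by hand before merging. The sketch below builds a reduced version of the drawing (one commit per side instead of two); branch names follow the text, while file names are invented.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b branch1
echo ours > ours.txt; git add ours.txt; git commit -q -m "J"

git checkout -q -b branch2 main
echo theirs > theirs.txt; git add theirs.txt; git commit -q -m "L"
git checkout -q branch1

# The shared starting point Git would pick for `git merge branch2`.
base=$(git merge-base branch1 branch2)
echo "merge base: $base"

# What "we" changed and what "they" changed, each relative to the merge base.
ours=$(git diff --find-renames --name-only "$base" branch1)
theirs=$(git diff --find-renames --name-only "$base" branch2)
echo "ours:   $ours"
echo "theirs: $theirs"
```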

If all goes well, Git makes its own new commit from the resulting snapshot. This new commit is a merge commit. What makes a commit a merge commit is that it has more than one parent. That's the only thing special about a merge commit: it has two or more parents.[3]

One of the two parents is the same parent as always: new merge commit M points back to commit J, since that's the commit we are using when we start this process. The other parent is simply the other commit we told git merge to use: commit L.[4] The snapshot for the new merge commit is the one built by applying the combined changes to the snapshot in the merge base H. So the result looks like this:

             I--J
            /    \
...--F--G--H      M   <-- branch1 (HEAD)
            \    /
             K--L   <-- branch2

Note that, as always, no existing commits have changed. We have simply added one new commit—a merge, M, that points back to both J and L—to our existing set of commits, and moved the current branch name forward to point to the new commit as usual.
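The two parents of a merge commit can be read back with the ^1 and ^2 suffixes. This sketch again uses a reduced graph with one commit per side; all names are placeholders.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "H"
git checkout -q -b branch1
echo ours > ours.txt; git add ours.txt; git commit -q -m "J"
git checkout -q -b branch2 main
echo theirs > theirs.txt; git add theirs.txt; git commit -q -m "L"
git checkout -q branch1

j=$(git rev-parse branch1)
l=$(git rev-parse branch2)

# Make merge commit M; --no-edit accepts the default merge message.
git merge -q --no-edit branch2

p1=$(git rev-parse HEAD^1)   # first parent: the commit we were on (J)
p2=$(git rev-parse HEAD^2)   # second parent: the commit we merged (L)
branch2_now=$(git rev-parse branch2)
echo "M has parents $p1 and $p2"
```

The merged snapshot contains both ours.txt and theirs.txt, and branch2 still points to L, since the merge only added a commit.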


[3] Git calls a commit with three or more parents an octopus merge. They're not really more useful than a two-parent merge. Other version control systems would require you to do multiple two-parent merges to get there, and that works fine in Git too.

[4] The normal-backwards-link first parent of a merge is always the commit we started with. The git log command has a --first-parent option that lets you ignore the other branch entirely. This is often useful later. I won't go into any detail here, but git log will sometimes, but not always, follow both parents of a merge, and the details get quite tricky.


Cherry-picking

Before we get to git rebase, we should make a stop at git cherry-pick. The reason for this is that git rebase is mostly a series of git cherry-pick operations. If you understand what each cherry-pick does, that makes it a whole lot easier to understand what rebase does.

Let's start with a similar drawing, although cherry-picking works in many situations, not just this particular one. Here we have two feature branches being developed more or less simultaneously, perhaps by two different groups:

          A--B--C--D--E   <-- feature1
         /
...--o--o--o--o--o   <-- main
               \
                F--G--H   <-- feature2 (HEAD)

Let's say that you are working on feature2 yourself, and are currently on commit H. (The round os are also commits. We just have not bothered to give them one-letter names.) You've hit a stumbling block where you need some code that, you heard recently, the folks working on feature1 already wrote.

They put that code in as commit C on their branch. That is, if we compare commits B and C, the change from B to C is exactly what you need in your branch.

There are a lot of ways to handle this case, but for whatever reason—expediency, perhaps—you decide that you'd just like to take their change, from B to C, and add that change to your commit H and make a new commit. We'll call this new commit C' because we will not only copy their change but also their commit message, and we'll mark our new commit with their name too, so that it looks so much like commit C that it's clearly a copy of C.

It will have a different hash ID, of course, because it will be a different commit. Its parent will be commit H, not commit B, and the parent hash ID of some commit is part of its metadata, hence is incorporated into its own hash ID. Also, while the author of C' will be them, the committer of C' will be you: every commit actually has two people's names on it for just this kind of purpose. When you make your own commits, both names are "you"; when you copy someone else's, they are the author, and you are the committer.

Anyway, in order to make this commit, we run:

git cherry-pick <hash-of-C>

Git will locate commit C (using the hash ID we give it), use C to find B, compare B-vs-C to see what they changed, and add that to your current commit H. Technically, Git accomplishes this using the same code as git merge. Instead of finding a merge base commit, though, the "merge base" is simply forced to be commit B. Git compares B-vs-C to see what they changed, and B-vs-H to see what you changed, then adds those changes together and applies them to the snapshot in B.

This may feel weird (it did to me), but if you work through this mathematically / algorithmically, you'll see that it actually does exactly the right thing. Knowing that Git is using the merge machinery to achieve the cherry-pick "copy a commit" trick will help you if and when you see merge conflicts during a cherry-pick: the conflicting changes are those from B-to-C for >>>>>>> theirs, and those from B-to-H for <<<<<<< ours.

If you don't see conflicts—which is often the case—Git will finish the cherry-pick on its own and make the new commit C':

          A--B--C--D--E   <-- feature1
         /
...--o--o--o--o--o   <-- main
               \
                F--G--H--C'  <-- feature2 (HEAD)

The difference from H to C', if you git show C' for instance, will be the same as the difference from B to C when you git show C. The log message for new commit C' will be a copy of the log message from commit C, too, and the author of C' will be the same person as the author of C. That's why we're calling this new commit C', after all: it looks almost exactly the same as C. It just has a different hash ID, and when git log works backwards from C', it goes to H, not to B.
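Here is a sketch of the copy in a scratch repository. The identities "them" and "you" are hypothetical, the graph is shortened to the commits that matter, and for simplicity only the author of B and C is set to "them".

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=you GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=you GIT_COMMITTER_EMAIL=you@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/main
echo base > base.txt; git add base.txt; git commit -q -m "base"

# feature1: commits B and C, authored by "them".
git checkout -q -b feature1
echo b > b.txt; git add b.txt
git commit -q --author="them <them@example.com>" -m "B"
echo c > c.txt; git add c.txt
git commit -q --author="them <them@example.com>" -m "C: add helper"
c=$(git rev-parse HEAD)

# feature2: our own commit H.
git checkout -q -b feature2 main
echo h > h.txt; git add h.txt; git commit -q -m "H"
h=$(git rev-parse HEAD)

# Copy the B-to-C change on top of H, producing C'.
git cherry-pick "$c" >/dev/null
copy=$(git rev-parse HEAD)
copy_parent=$(git rev-parse HEAD^)

# %an = author name, %cn = committer name, %s = subject line.
info=$(git log -1 --format='%an|%cn|%s' HEAD)
echo "C' = $copy ($info), parent $copy_parent"
```

Only the B-to-C change is copied: c.txt appears on feature2, while b.txt, which was added by B, does not.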

Rebasing

As mentioned above, the essence of rebasing is repeated cherry-picking. To see how and why, let's take a look at a typical reason to do a rebase.

Suppose we start with this:

             G--H--I   <-- branch (HEAD)
            /
...--D--E--F   <-- mainline

Now, for whatever reason, and in whatever way,[5] we pick up some new commits on the main-line branch:

             G--H--I   <-- branch (HEAD)
            /
...--D--E--F---------J--K   <-- mainline

There's no particularly strong reason to draw commit J that far to the right, and in the rest of the drawings below, I won't bother: we picked it up after making G-H-I but maybe whoever made it, made it about the same time we made G. The times don't really matter though. It's what's in the commits—both as snapshots, and as the graph we're drawing—that matters.

If our branch is done now, we could just merge the branch into the mainline. That's the simplest option, by far, and it results in this:

             G--H--I   <-- branch
            /       \
...--D--E--F--J--K---M   <-- mainline (HEAD)

I skipped the letter L to go for M-for-merge here. The first parent of M will be K and the second parent will be I, because we'll make this merge with git switch mainline and then git merge branch.

The reason to use merge is because it's simple and reflects reality. The reason not to use merge is because it means that when we look back at what happened, we see exactly what happened: that branch got developed as a side branch, and meanwhile commits J and K happened, and then we merged them. This extra detail is—probably! we hope!—just a distraction. What if we could, instead, see:

...--D--E--F--J--K--G--H--I   <-- mainline

as if we had somehow known in advance that J-K was going to go in, and had waited and written G after K was done, to build on K? The future person going and looking at this history sees a greatly simplified history.

The downsides to doing this are two:

  1. This isn't really what happened. Note the words probably and we hope in our statement above. We hope that this simplified view of what happened is good enough, and not distracting. It's probably good enough (and definitely non-distracting).

  2. We literally can't do that. Commit G has commit F as its parent.

The first point is purely philosophical, and is where you must make up your own mind. But the second point is purely technical: enter git rebase. What git rebase does is copy some commits, then move a branch name.

Let's start with what we have now:

             G--H--I   <-- branch (HEAD)
            /
...--D--E--F--J--K   <-- mainline

Now, let's create a new temporary branch. We'll need to make up a name for it, like tmp (but we'll fix that in a moment). We'll make it point to commit K, which is where we want our copies to go after. We'll check out commit K and make the temporary branch point there, like this:

             G--H--I   <-- branch
            /
...--D--E--F--J--K   <-- mainline, tmp (HEAD)

Now—or maybe even just before we create the temporary branch—we'll also list out the hash IDs of commits G, H, and I. How do we know which hash IDs to list out? We'll start from branch, which points to I, and keep working backwards, one commit at a time, until we hit commits that are already reachable from commit K, where mainline points. Since K points back to J which points back to F, that's commits I, H, and G.

This list is in the wrong order, but that's easy to fix: we just need to reverse it.[6] So now we have the list of commits to copy: G, then H, then I, in that order. Git drops this list of commit hash IDs into a hidden temporary file.[7]
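For a simple linear case like this one, the list can be approximated with the two-dot range syntax plus --reverse. This is a sketch of the selection step only, not of what git rebase does internally; commit subjects match the letters in the drawings and the file names are invented.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/mainline

# Helper (hypothetical): one commit that adds one file named after its subject.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk F
git checkout -q -b branch
mk G; mk H; mk I
git checkout -q mainline
mk J; mk K

# Commits reachable from branch but not from mainline, oldest first.
todo=$(git log --reverse --format=%s mainline..branch | paste -s -d ' ' -)
echo "commits to copy, in order: $todo"
```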

We now check out our temporary branch and begin copying commits, one at a time, from the list. That is, we run git cherry-pick G, more or less. This makes a copy G'. If everything works, we now have this:

             G--H--I   <-- branch
            /
...--D--E--F--J--K   <-- mainline
                  \
                   G'  <-- tmp (HEAD)

We do a second git cherry-pick for H now. Assuming it works, we get:

             G--H--I   <-- branch
            /
...--D--E--F--J--K   <-- mainline
                  \
                   G'-H'  <-- tmp (HEAD)

We repeat this until we have copied all commits:

             G--H--I   <-- branch
            /
...--D--E--F--J--K   <-- mainline
                  \
                   G'-H'-I'  <-- tmp (HEAD)

and then we finish the git rebase process by moving the name branch, which is where HEAD was when we started, to point to the last copied commit I', and re-attach HEAD to branch:

             G--H--I   ???
            /
...--D--E--F--J--K   <-- mainline
                  \
                   G'-H'-I'  <-- branch (HEAD)

Because git log works by looking at branch names to find commits to show, we won't see the old I commit any more. That means we won't see H, which means we won't see G. If we don't notice that the hash IDs are different, we won't realize that these copies are in fact copies of the originals: we wrote the original commits, and we made the copies, so we're still author and committer. The log messages are the same. The diffs, from K to G', then G' to H', then H' to I', are the same as they were from F to G and so on.

The end result is that it seems as though we waited and then wrote our code after we got commit K. This is the illusion we chose to project.
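The copy-then-move behavior can be checked after a real rebase. The sketch below rebuilds the same graph in a scratch repository with made-up file names and relies on ORIG_HEAD, which git rebase sets to the branch tip from before the rebase.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/mainline

# Helper (hypothetical): one commit that adds one file named after its subject.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk F
git checkout -q -b branch
mk G; mk H; mk I
old_tip=$(git rev-parse branch)          # original commit I
git checkout -q mainline
mk J; mk K
k=$(git rev-parse mainline)

git checkout -q branch
git rebase -q mainline

new_tip=$(git rev-parse branch)          # copied commit I'
first_copy_parent=$(git rev-parse branch~3)   # parent of G' should be K
subjects=$(git log --reverse --format=%s mainline..branch | paste -s -d ' ' -)
orig=$(git rev-parse ORIG_HEAD)          # the abandoned original tip

echo "old tip $old_tip, new tip $new_tip"
echo "copies on top of K: $subjects"
```

The subjects are unchanged while the hash IDs differ, and the original chain is still reachable through ORIG_HEAD (or the reflog) even though no branch name points to it.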

There's a minor but important detail here. When rebase is doing its copying, one commit at a time, it doesn't actually make up a tmp branch. Instead, it uses what Git calls its detached HEAD mode. In this mode, the special name HEAD is not actually attached to any branch name at all. It just points directly to some commit.

This mode is used with git switch --detach to look at a historical commit, and it's also used by git rebase. In general, you don't want to do new work in this mode, but the stuff we do in git rebase is involved in copying old work, so that's mostly OK. The reason to know that rebase uses this is because the copying steps can fail, sort of: each cherry-pick has the potential to have a merge conflict. If it does have a merge conflict, it stops with an error message. You have to clean up the error and then continue the rebase. While you're doing the cleanup, you will be in this detached HEAD mode.

If you forget that you are in the middle of a conflicted rebase, you may find yourself in detached HEAD mode and not know why. The git status command is helpful here: it will tell you that you are in this mode, and that you are in the middle of a conflicted rebase. That is, it will say that in any modern Git. There are still some systems out there using ancient versions of Git, where git status is less helpful. Try to avoid those ancient versions (upgrade Git if possible, or upgrade the entire OS, or whatever). If you're stuck with one, just don't forget that you're in the middle of something.

When rebase finishes—or if you end it with git rebase --abort to put everything back the way it was, as if you'd never started the rebase—Git will go back to its previous mode: attached HEAD, on the unchanged branch, assuming you started in attached-HEAD-mode on some branch. You will only see this while you're fixing things up, not once you're done.
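A conflicted rebase can be provoked deliberately to see the detached HEAD state and the effect of --abort. This is a sketch with invented contents: both branches change the same line of the same file, so the first copy step stops.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/mainline

echo base > file.txt; git add file.txt; git commit -q -m "F"
git checkout -q -b branch
echo ours > file.txt; git commit -q -am "G"
old_tip=$(git rev-parse branch)
git checkout -q mainline
echo theirs > file.txt; git commit -q -am "J"
git checkout -q branch

# The first cherry-pick conflicts, so rebase stops with a non-zero exit status.
if git rebase mainline >/dev/null 2>&1; then
  echo "unexpected: rebase finished without a conflict"
  exit 1
fi

# While stopped, HEAD is detached: symbolic-ref finds no branch name.
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi
echo "during rebase: HEAD is $state"
git status | head -n 2

# Give up: put everything back the way it was before the rebase started.
git rebase --abort
restored=$(git symbolic-ref --short HEAD)
tip_after_abort=$(git rev-parse branch)
echo "after abort: on $restored"
```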

So that's what git rebase does: it copies some set of commits, abandoning the original set. But why does this result in the have diverged message? Unfortunately, I have run out of space, and will have to leave that to other answers.


[5] I was going to include a bit here on git fetch etc., but I have run out of space.

[6] In some more general cases, we will need to put the commits into a topologically sorted order, using some graph theory, but for this simple case that amounts to just reversing the list.

[7] Each copy is a cherry-pick, and each cherry-pick can stop with an error that you have to clean up. If Git does stop like this, it needs the file, along with progress information, to be able to resume the rebase with git rebase --continue.

torek
- - - - - - - - - (integration branch which also has feature commits)
    |
    - - (origin/integration)

Something like this has happened.

  • If you alone have access to the repository, then you can force push the integration branch -- but this is a lazy and dirty way of doing things. I do not recommend you do this with a public repository.

  • You could merge your new feature branch with origin/integration and then push it, or you could rebase on top of origin/integration and push that. I am unsure which approach would be best.

BenKoshy