PR for branch made from previous branch that was squashed and merged shows previous branch's commits

Question

Note: Sorry for the title. I'm unsure how to word my challenge succinctly, so flagging this question as a duplicate (accurately) would be very helpful.

Here's a timeline of relevant actions which will form the basis for my question:

PR is opened for Feature A to master
Feature B is branched from Feature A, work continues.
Feature A is squashed and merged into master.
PR is opened for Feature B to master

Problem: the PR for Feature B shows all previous (unsquashed) commits from Feature A.

How do I, preferably without manually dropping all Feature A commits or cherry-picking Feature B commits, rebase Feature B on master and show in the PR only the commits from A..B?

@DanSchnau I wish it were so easy. At the very least you get a boatload of conflicts. Even working through them, the commit history for the Feature B PR on GitHub still shows all of Feature A's commits. — Adam Terlson, Dec 22 '17 at 15:08
Since branch B was created from branch A, it contains all the commits of branch A. When you make a pull request from branch B to master, it will try to merge all the commits of branches A and B. Since some of the commits from branch A are already merged, it will show the rest of the commits. So what you are observing is normal. As far as I know, there is no quick solution. You can create a branch using cherry-pick, or manually remove the unneeded commits from branch B. — Md Monjur Ul Hasan, Dec 22 '17 at 15:11
Cherry-picking or dropping commits one-by-one will not work for my case. There's got to be a better way. — Adam Terlson, Dec 22 '17 at 15:34
@MdMonjurUlHasan I'm aware that what happens is normal, expected behavior. The question is how best to deal with the situation when what's normal isn't what's desired. Cherry-pick/drop is not a viable solution in my case. — Adam Terlson, Dec 22 '17 at 15:35
Is there a way to, say, drop all commits between master and the first commit of Feature B? — Adam Terlson, Dec 22 '17 at 15:40
@AdamTerlson: I've posted a complete answer, but the really short executive summary is: *there **isn't** a (useful) "first commit" of feature-B*. Feature-B includes all the commits that are on feature-A, and most of the commits in all of history. You can instead use `--onto` to separate the two parts that `git rebase` needs so that you can tell it where to stop searching for commits to copy. — torek, Dec 22 '17 at 19:12

score 1 · Accepted Answer · answered Dec 22 '17 at 19:10

TL;DR

You must cherry-pick, even though you don't want to. You can do this cherry-picking in a highly automated and often—but not always—easy and painless fashion using git rebase --onto.

Description

GitHub itself is, as far as I know (which is not all that far), completely useless here. You can do what you need to do in shell-level Git, though.

Brief background review: When you build a branch in Git, what you are really doing is adding commits, generally one at a time. Git's basic unit, and raison d'être, is the commit. Each commit is uniquely identified by a hash ID like 95ec6b1b3393eb6e26da40c565520a8db9796e9f. No two different Git objects ever have the same hash ID. Each commit is almost a standalone entity, containing a complete snapshot of your source code. The "almost" part comes about because most commits contain, as part of their metadata, the hash ID of one previous commit, which we call the commit's parent commit. A branch name like feature-A contains the hash ID of one single commit, which Git calls the tip commit of the branch.

When you git checkout feature-A, make edits, git add the files, and git commit the result, you create a new commit. The new commit's parent is the commit that was the tip, that you had git checkout-ed. Its snapshot is all the files that were in the original commit except for those that git add overwrote with the new content that you edited. Being a completely new commit, it gets a new, unique hash ID, and Git then stores the new ID into the branch name, so that the new commit you just made is now the tip commit of feature-A.
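To see this parent linkage concretely, here is a minimal sketch that builds a throwaway repository. The temporary directory, the file name, and the identity settings are assumptions made up for the illustration; only the branch name feature-A comes from the scenario.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.name "Example"    # placeholder identity so commits succeed
git config user.email "example@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
git checkout -q -b feature-A      # feature-A starts at the current tip
old_tip=$(git rev-parse feature-A)

echo two >> file.txt
git add file.txt
git commit -q -m "second"
new_tip=$(git rev-parse feature-A)     # the name now holds the new hash ID
parent=$(git rev-parse "feature-A^")   # parent recorded in the new commit

echo "old tip:           $old_tip"
echo "new tip:           $new_tip"
echo "parent of new tip: $parent"
```

The last two lines printed should differ from each other, while the parent should match the old tip: the new commit points back at the commit that was checked out when it was made.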

The problem

So far, this is not super-interesting, but we should note how the commits got chained together, one at a time, built on previous commits:

          1   <-- feature-A (HEAD)
         /
...--o--o   <-- master

became:

          1--2   <-- feature-A (HEAD)
         /
...--o--o   <-- master

which eventually became:

          1--2--3--4--5   <-- feature-A (HEAD)
         /
...--o--o   <-- master

You then made a pull request: "please obtain these new commits 1-2-3-4-5 and do something to incorporate them." Whoever is your upstream eventually did obtain those commits and incorporate them, but (here is the problem) they did so using GitHub's "squash and merge" button, which does the equivalent of git merge --squash and so doesn't incorporate those commits themselves at all.

Instead, what git merge --squash does is to use Git's merge machinery to do the "merge as a verb" process of combining changes, but then make a totally new commit. In their upstream they may have already added some other new commits, so that by the time they brought in your commits 1-2-3-4-5 they had:

          1--2--3--4--5   [imported - no name]
         /
...--o--*--A--B   <-- master

They had their Git (and GitHub) combine the changes from commit * (the merge base) to B, i.e., what they did, with the changes from * to 5, i.e., what you did, and make a new commit C from the result. Because this is a --squash operation, the new commit does not record its second parent, leaving the graph to look like this:

          1--2--3--4--5
         /
...--o--*--A--B---------C   <-- master

when you might wish it looked like this instead:

          1--2--3--4--5
         /             \
...--o--*--A--B---------C   <-- master

It doesn't have the extra linkage, though, so now you must deal with this.
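The missing linkage can be checked directly. The following is a small sketch in a throwaway repository, with made-up file names and a single commit standing in for the whole 1-2-3-4-5 chain; it runs git merge --squash locally rather than going through GitHub.

```shell
set -e
repo=$(mktemp -d)                       # throwaway repository (assumed)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master # make sure the first branch is master
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"   # commit *
git checkout -q -b feature-A
echo work > a.txt; git add a.txt; git commit -q -m "1"            # your work
git checkout -q master
echo other > up.txt; git add up.txt; git commit -q -m "B"         # their work

git merge -q --squash feature-A   # combines the changes, makes no commit yet
git commit -q -m "C"              # ordinary commit whose only parent is B

# rev-list --parents prints the commit hash followed by its parent hashes
parent_count=$(( $(git rev-list --parents -n 1 master | wc -w) - 1 ))
if git merge-base --is-ancestor feature-A master; then
  reachable=yes
else
  reachable=no
fi
echo "C has $parent_count parent(s); feature-A reachable from master: $reachable"
```

With a real merge the count would be 2 and feature-A would be reachable from master; after the squash, Git has no record that C came from feature-A.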

Meanwhile, you made more commits

You went ahead and created a feature-B branch in your own repository:

          1--2--3--4--5   <-- feature-A, feature-B (HEAD)
         /
...--o--o   <-- master

You now made a few more commits:

                        6  <-- feature-B (HEAD)
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master

eventually resulting in:

                        6--7--8  <-- feature-B (HEAD)
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master

At some point you may even have obtained their commits A-B-C from your upstream (their Git repository). If you have not done that yet, you should do it now:

                        6--7--8  <-- feature-B (HEAD)
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master
         \
          A--B--C   <-- upstream/master

Note that their commit C is roughly the equivalent of adding up your commits 1-2-3-4-5, except that the parent of C is B, not the commit that was (and in this drawing still is) the tip of your master.

What you would now like to do is make a copy of the commit chain 6-7-8, except that you want to base these copies on commit C, not commit 5. That is, the result you want looks like this:

                        6--7--8  [old feature-B, to be abandoned]
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master
         \
          A--B--C   <-- upstream/master
                 \
                  C6-C7-C8   <-- feature-B

The Git command that copies commits en-masse, and in the process makes the copies have a new base, is git rebase. But if you just run:

git checkout feature-B && git rebase upstream/master

Git will select for copying those commits that are reachable from the name feature-B but not from the name upstream/master. The word reachable here means: if we start at the tip commit and work backwards the way Git does, which commits will we encounter? Starting from feature-B, we begin with commit 8, then reach (via its parent hash) commit 7, then 6, and so on down the chain towards the left. Eventually we'll reach the tip commit of your master, and continue to the left. If we start from upstream/master and work backwards instead, we'll pass through C, B, and A, then reach the tip commit of your master and continue to the left, so those commits are the ones that are not copied. That leaves commits 1-2-3-4-5-6-7-8 to be copied.

Again, that's the problem: there are too many commits here. We want to exclude commit 5 and everything earlier, so that we copy only the 6-7-8 chain. This is where we use git rebase --onto instead of just git rebase.

Using `--onto`

When git rebase does its job, it has to pick out two things, not just one:

Which commits should we copy? More precisely, which commits shouldn't we copy? We'll copy some commits up to the current commit, but what's the limit? What don't we copy?
Where should we put the copies?

Normally we just say git rebase upstream/master and it figures out both of these from the one name. The copies go after the named commit, and the commits we copy are those we can't get to from the named commit.

With git rebase --onto upstream/master, we tell Git explicitly: Put the copies after the tip commit of upstream/master. That leaves the other argument to specify the limit: Don't copy. We want to tell Git: Don't copy commit 5 or anything earlier. So we need to find the hash ID of commit 5, or something that works to locate commit 5.

The branch name feature-A points to commit 5. Look at the graph we drew above: there it is! Or, run git log --all --decorate --oneline --graph and look at the graph Git will draw. Is there a name for the commit that ends the chain that Git shouldn't copy? If so, you can use that name. If not, you can just type in the raw hash ID.
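Before rebasing, it can help to list what a given limit would leave for copying. The sketch below rebuilds the drawing in a throwaway repository and compares the two ranges. The commit helper, the file names, and the use of a local branch named upstream/master as a stand-in for the real remote-tracking name are all assumptions for the illustration.

```shell
set -e
repo=$(mktemp -d)                       # throwaway repository (assumed)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

commit() {                              # hypothetical helper: one commit per call
  echo "$1" >> "$2"
  git add "$2"
  git commit -q -m "$1"
}

commit base base.txt                    # tip of your master
git checkout -q -b feature-A
for n in 1 2 3 4 5; do commit "$n" a.txt; done
git checkout -q -b feature-B
for n in 6 7 8; do commit "$n" b.txt; done

git checkout -q -b upstream/master master   # local stand-in for upstream
commit A up.txt
commit B up.txt
git merge -q --squash feature-A
git commit -q -m "C"

# What a plain "git rebase upstream/master" would select:
plain=$(git log --format=%s --reverse upstream/master..feature-B | xargs)
# What remains once everything reachable from feature-A is excluded:
limited=$(git log --format=%s --reverse feature-A..feature-B | xargs)
echo "plain:   $plain"      # 1 2 3 4 5 6 7 8
echo "limited: $limited"    # 6 7 8
```

The second range is the set of commits that naming feature-A as the limit leaves over.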

In our case, as long as none of the names have changed the commits to which they point, we can just run:

git checkout feature-B
git rebase --onto upstream/master feature-A

This tells Git to check out (get onto the tip commit of, and record the name) feature-B; then, ending at the current commit, copy some commits, putting the copies after the commit to which upstream/master points. The copies end with the current commit, and start with whatever is left after removing commits ending with feature-A.

That, of course, is commits 6-7-8. So Git will git checkout --detach upstream/master, making HEAD point directly (without a branch name) to the commit:

                        6--7--8  <-- feature-B
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master
         \
          A--B--C   <-- upstream/master, HEAD

Then Git will copy commit 6 as if by doing git cherry-pick on its hash ID:

                        6--7--8  <-- feature-B
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master
         \
          A--B--C   <-- upstream/master
                 \
                  C6   <-- HEAD

If that goes well, Git will cherry-pick commit 7:

                        6--7--8  <-- feature-B
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master
         \
          A--B--C   <-- upstream/master
                 \
                  C6-C7   <-- HEAD

then repeat for 8 which becomes C8; and finally, Git will tear the label feature-B off the old chain 6-7-8 and paste it onto the end of the new copies C6-C7-C8 instead:

                        6--7--8  [abandoned]
                       /
          1--2--3--4--5   <-- feature-A
         /
...--o--o   <-- master
         \
          A--B--C   <-- upstream/master
                 \
                  C6-C7-C8   <-- feature-B (HEAD)

With the label feature-B now pointing to C8, Git will re-attach HEAD to that label, and the rebase is complete. You can now make a pull request that asks your upstream people to incorporate commits C6-C7-C8 into their repository.
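The whole sequence can be rehearsed in a throwaway repository before touching real work. This is only a sketch under stated assumptions: the commit helper and file names are invented, and a local branch named upstream/master stands in for the real remote-tracking name. The files are chosen so that the copies apply without conflicts, which a real rebase does not guarantee.

```shell
set -e
repo=$(mktemp -d)                       # throwaway repository (assumed)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

commit() {                              # hypothetical helper: one commit per call
  echo "$1" >> "$2"
  git add "$2"
  git commit -q -m "$1"
}

commit base base.txt                    # tip of your master
git checkout -q -b feature-A
for n in 1 2 3 4 5; do commit "$n" a.txt; done
git checkout -q -b feature-B
for n in 6 7 8; do commit "$n" b.txt; done

git checkout -q -b upstream/master master   # local stand-in for upstream
commit A up.txt
commit B up.txt
git merge -q --squash feature-A
git commit -q -m "C (squash of feature-A)"

git checkout -q feature-B
before=$(git rev-list --count upstream/master..feature-B)   # 8
git rebase -q --onto upstream/master feature-A              # copy only 6-7-8
after=$(git rev-list --count upstream/master..feature-B)    # 3
subjects=$(git log --format=%s --reverse upstream/master..feature-B | xargs)
new_base=$(git rev-parse "feature-B~3")     # commit the copies sit on
upstream_tip=$(git rev-parse upstream/master)

echo "commits ahead of upstream/master: $before before, $after after"
echo "copied commits: $subjects"
```

If feature-B was already pushed, the published branch still names the old 6-7-8 commits, so updating an existing pull request would need a forced push, for example git push --force-with-lease origin feature-B, where origin is an assumed remote name.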

An incredibly thorough answer. I appreciate your time and effort in crafting it. Thank you! — Adam Terlson, Dec 24 '17 at 00:11

score 0 · Answer 2 · answered Nov 10 '20 at 09:48

I just had the same problem.

This is how I successfully dealt with it:

1. Before the "Feature A" PR was merged, I set the branch of that PR as the base branch of the "Feature B" PR, to hide the commits of "Feature A" from that new PR, and to make a smaller diff for my team-mates to review.
2. After the "Feature A" PR was merged, I re-set master as the base branch of the "Feature B" PR => GitHub warns about conflicts with master (which you'll also get if you try to $ git merge master or $ git rebase master)
3. To update the branch while keeping the changes of "Feature B", I run the following commands – assuming that your PR's branch is called feature_b:

$ git checkout feature_b
$ git merge -s ours master
$ git push origin feature_b

=> No more conflict. The PR has the same diff as in the end of step 1, and it's ready to be merged to master.
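To make clear what -s ours does here, the following sketch (throwaway repository, invented file names) compares the branch's snapshot before and after the merge. It assumes master has one commit that feature_b lacks.

```shell
set -e
repo=$(mktemp -d)                       # throwaway repository (assumed)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"
git checkout -q -b feature_b
echo feature > b.txt; git add b.txt; git commit -q -m "feature work"
git checkout -q master
echo upstream > up.txt; git add up.txt; git commit -q -m "master-only change"

git checkout -q feature_b
tree_before=$(git rev-parse "HEAD^{tree}")   # snapshot before the merge
git merge -q -s ours -m "Merge master (ours strategy)" master
tree_after=$(git rev-parse "HEAD^{tree}")    # snapshot after the merge

# The merge commit records master as a parent but keeps feature_b's files.
if [ -e up.txt ]; then has_master_file=yes; else has_master_file=no; fi
echo "tree unchanged: $([ "$tree_before" = "$tree_after" ] && echo yes || echo no)"
echo "master-only file present: $has_master_file"
```

The snapshot is identical before and after, which is why the conflicts disappear. It also means anything on master that feature_b does not already contain is left out of the result, so this approach seems safe only when master holds nothing beyond the squashed "Feature A" changes.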