I am working with a relatively big main branch. I create feature branches and merge them into the main branch. Most of these feature branches are orphan branches. Sometimes, I merge one upper-level main branch into a lower-level main branch (PROD to UAT) to align them.

When doing git pull, I sometimes forget to use the --rebase option, and I end up with annoying merge commits. I want to remove them.

Please see the attached snapshot for this command TODO:

git rebase -i -rebase-merges HEAD~4

I have the following questions:

  • Why did when I used this command instead git rebase -i -rebase-merges HEAD~3, the TODO file became much smaller and only 3 or 4 items showed up in the file?
  • I'm not sure exactly what I did, but when I tried to squash a normal commit that came just before a merge commit, the merge commit disappeared and the feature branch that had been merged was also removed from the history. Can you explain what happened? I remember that the process stopped, I had to resolve some conflicts and make a commit, and then I continued the rebase.
  • How can I squash the indicated merge commits into one normal commit or merge commit?
  • How can I squash the indicated merge commits into a previous normal commit?
  • I want to understand more about the lines starting with reset and label. Can you give some details or point me to a link to do more reading?

I appreciate your help.

[Image: screenshot of the interactive rebase TODO file; not available in this copy]

tarekahf

3 Answers


To set --rebase by default for git pull:

git config --global pull.rebase true

To view your history: use git log --graph --oneline

You can compare:

git log --graph --oneline HEAD~3..HEAD
# and
git log --graph --oneline HEAD~4..HEAD

(those are the commits selected by both your git rebase commands)

LeGEC

Part 1 (link to part 2 here)

You cannot squash a merge commit. There's a simple reason for this: a merge commit is, by definition, a commit with two parents. Squashing two commits together means that you'd like to replace two ordinary (i.e., single-parent) commits with one ordinary commit. As a merge commit is not an ordinary commit, it's simply not eligible for squashing.

As a rather special case, it is possible—but usually a bad idea—to squash an ordinary commit into a merge commit. This produces what is known as an evil merge: see Evil merges in git?

Why did when I used this command instead git rebase -i -rebase-merges HEAD~3, the TODO file became much smaller and only 3 or 4 items showed up in the file?

Rebase is about copying commits to new and improved (or at least, presumably-improved) commits. Having copied those commits, you then direct your Git software to stop using the originals and start using the copies instead.

Only ordinary commits can be copied this way. However, there is a new-ish --rebase-merges option (side note: the double hyphen here is required; you can use -r as a synonym to avoid having to type a double hyphen), first available in Git 2.18, with numerous fixes and improvements in later versions. This tells Git to re-perform the specified merges. To get rid of merges, you want to avoid performing them at all. This requires detailed understanding of how rebase works.

The argument you're using here, HEAD~3 or HEAD~4, specifies which commits not to copy. Without this information Git would have to assume that you mean to copy every reachable commit (git rev-list --count HEAD would tell you how many commits that is, but it's probably hundreds or thousands). This argument is required unless you use --root to tell Git that it should copy every reachable commit (usually a bad idea, which is why rebase --root was not in Git for many years, until 1.7.12 was released in 2012).

Understanding commits

Getting all of this stuff into your head is a pretty big commitment (if I may use the word commit here). Still, it's important to do it. Remember that a repository itself is primarily a big box full of commits, so it's very important to know what each commit is and how they work, individually and together. More precisely, the two fundamental components of any Git repository are an objects database, which holds commit objects and other supporting objects, and a names database, which holds branch and tag and other names.

We start here with the objects, most specifically the commits. (The other three object types, namely tree, blob, and annotated tag, are not in your face the way commits are: they mostly just work and you don't have to know the details, the way you do for commits.) Every Git object is numbered, with a big ugly random-looking hash ID, or more formally an object ID or OID. These object IDs can often be abbreviated, e.g., to the 4d3abd9 that appears in your image, but each one is actually 40 characters long (20 bytes or 160 bits), at least today. (Future versions of Git will someday have 256-bit = 64 character OIDs.)

For commits in particular, each commit gets a unique hash ID at the time you (or whoever) make the commit. That one hash ID is now reserved forever, in every Git repository, even ones that do not yet exist, to mean that particular commit. This literally cannot work forever, and someday Git will break, but the sheer size of the hash ID is intended to put that day so far into the future (billions of years or more) that we don't care about this. To make this work as well as it does (which in practice is just fine, even though it's theoretically rubbish), no part of any object can ever be changed. The trick by which Git ensures that other Gits that have never seen your new commit yet, and have no communications link to your repository, have already reserved that hash ID, is to use a cryptographic checksum of the commit content, and that trick only works if the content cannot be changed.¹ The checksum is currently the SHA-1 hash.

So: a commit is a numbered entity, found in Git's objects database—a simple key-value store with a hash ID as the key—by looking up its hash ID. Git desperately needs the hash ID to find the commit. Without that hash ID, Git is helpless. That's why you keep seeing them.
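As a rough illustration of the "hash ID is a checksum of the content" idea, here is a minimal sketch that hashes the same content two ways. It uses a blob rather than a commit because a blob is easier to build by hand; commits are hashed the same way, with a "commit <size>" header instead of "blob <size>". It assumes a Linux-style sha1sum is on the PATH and a repository using SHA-1 object IDs; the sample content "hello" is just a placeholder.

```shell
# Hash the same 6-byte content two ways; no repository is needed for this.
via_git=$(printf 'hello\n' | git hash-object --stdin)

# Git hashes a small header ("blob <size>" plus a NUL byte) followed by the content.
via_sha1=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

echo "git hash-object: $via_git"
echo "sha1sum:         $via_sha1"
```

If the two lines match, then changing even one byte of the content would necessarily change the ID, which is why stored objects can never be modified in place.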

But what's in a commit? What good is a commit? The answer is simple, and two-fold:

  • A commit stores (indirectly) a full snapshot of every file. More precisely, it stores a full snapshot of every file Git was told to store, at the time you, or whoever, made the commit. As with all objects, this snapshot is completely read-only. For multiple purposes, the files are stored in a special form in which their content is compressed and de-duplicated. The de-duplication takes care of the obvious objection, that if every commit stores every file every time, the repository will become hugely bloated. As long as most commits mostly re-use most of the files from one or more previous commits, those files take literally no space at all, because they're de-duplicated away.

  • A commit also stores (directly) some metadata, or information about the commit itself. This is the only part of a commit that must occupy some space, and it's pretty tiny: it holds the name and email address of the person who made the commit, for instance, and some date-and-time-stamps and the log message. If your log message is not crazy long, the uncompressed commit is probably no more than a few hundred bytes (and then it gets compressed too).

Crucially for Git itself, the metadata for any given commit also stores a list of hash IDs. This list is usually just one entry long, which makes this commit an ordinary commit, with a single parent. The hash ID stored in the commit, in its metadata, is the hash ID of the parent of this commit, i.e., the commit that comes just before this one.

At least one commit in any non-empty repository is special because it has no parent: it's the first commit and it therefore is a root commit instead of an ordinary commit. It still has a full snapshot, just like any commit. It's possible to make extra root commits, using git checkout --orphan or git switch --orphan, but that's usually a bad idea. (You mentioned that you are using "orphan branches", and that's usually a bad idea, as we'll see.)
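To see this metadata for yourself, something along these lines should work. It is only a sketch: it builds a throwaway repository in a temporary directory, and the identity, file name, and commit messages are placeholders I made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
root=$(git rev-parse HEAD)

echo two >> file.txt
git commit -q -a -m "second commit"

# Show the raw commit object: tree, parent, author, committer, log message.
git cat-file -p HEAD

# Extract the parent hash ID(s) recorded in each commit's metadata.
parent_of_second=$(git cat-file -p HEAD | sed -n 's/^parent //p')
parents_of_root=$(git cat-file -p "$root" | sed -n 's/^parent //p')
echo "second commit's parent: $parent_of_second"
echo "root commit's parents:  '$parents_of_root'"
```

The second commit lists the first commit's hash ID as its parent, while the root commit has no parent line at all.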

Some commits have more than one parent, which makes them merge commits. A merge commit still has just the single snapshot, like any other commit. Most merge commits have exactly two parents. Some version control systems (e.g., Mercurial) require this, but since Git has a list of parents, Git allows any integer ≥ 2 here. A multi-parent merge does not do anything that a two-parent merge couldn't do; in fact, it's kind of the reverse: a two-parent merge can do things that a 3+-parent merge can't. Merges with three or more parents can still be used for tying together multiple features. In my own opinion, though, they're mostly for showing off.


¹ The observant student, or anyone who knows Mercurial, can immediately note that it works if there's some unchanging portion of the content: if we have some changeable part that's not included in the checksum, we're fine. Git doesn't bother implementing this though, on the theory that any such content could be put somewhere else (which is always true but not always efficient, but as Kilgore Trout might say, so it goes).


Relating a commit to an earlier commit: links in a chain

Let's pause now and draw an ordinary commit. We don't know (nor want to know) its actual hash ID; let's just call it H for "hash":

            <-H

What's this little arrow sticking out of H? It's to represent the hash ID stored in the commit's metadata. This hash ID allows Git to retrieve H's parent from the objects database. Let's draw in the parent:

        <-G <-H

We say that H points to its parent G. But there's an arrow coming out of G too, because like H, G is an ordinary commit, so G points backwards to its parent:

... <-F <-G <-H

This goes on forever—or rather, until we reach a root commit (probably the root commit), which lacks the arrow. That's where we, and Git, get to stop and rest.

There is one problem here, which is that we have to tell Git the hash ID of commit H. Nobody wants to memorize hash IDs, and type them in, so there's one more arrow, and it's one I'll almost always draw. I get lazy about the ones from commit back to parent:

...--F--G--H   <--

I'll show what's on the right of that final arrow in a moment. For now, let's observe: Naming commit H implicitly names every commit leading up to and including commit H. More precisely, Git has two ways to look up a commit: "with history", which means drag in previous commits too, or "without history", which means we should follow some arrow once to find one commit. When we use the "with history" variety of looking things up, this drags in lots of commits, by following arrows backwards until we either run out of commits (by reaching the root) or by hitting a place we're told to stop.

Branch names find the last commit: how branches grow

Now we fill in the part to the right of that last arrow:

...--G--H   <-- main

What goes on the right is the branch name, or in many cases, branch names and other names too. We can have more than one name that selects commit H like this:

...--G--H   <-- feature, main

Whenever a branch name points to any one particular commit—as in this case, where both names feature and main point to H—we say that this particular commit is the tip commit of that branch. So commit H is the tip of both branches, feature and main.

When we want to use Git, we'll check out (or git switch to) one of these branch names. We can only use one branch name at a time,² for reasons I won't cover here as this will already be too long, but we'll draw that by attaching the special name HEAD to one of the branch names:

...--G--H   <-- feature, main (HEAD)

Here, we are using commit H because we're selecting one commit, without history, through the name main. If we run git switch feature or git checkout feature—these do the same thing—we get:

...--G--H   <-- feature (HEAD), main

We're still using commit H (we haven't changed anything at all about the commit we're using), but now we're doing so through the name feature.

If we now make one new commit, our new commit will:

  • save a full snapshot of every file (de-duplicated) as those files appear in Git's index / staging area (which we won't cover here); and
  • have Git put our commit message and other metadata together to make a new commit, that gets a new, unique, never-before-used, never-to-be-used-again hash ID that we'll just call I.

This new commit I will necessarily have commit H as its parent, because that's the commit we're using when we make new commit I. So I will point back to H:

...--G--H
         \
          I

But what about the branch names and their arrows? Well, the name main contains the raw hash ID of existing commit H, and that has not changed. But we're on branch feature, so the last step of git commit is to write the new hash ID (the raw hash ID for commit I) into the name feature, so that feature now points to I instead of H:

...--G--H   <-- main
         \
          I   <-- feature (HEAD)

The special name HEAD is still attached to feature, but feature now points to I as its tip commit. Commits up through H are still on both branches, but commit I is currently only on feature.
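Here is a small sketch of that last step, assuming a throwaway repository with made-up file contents and a placeholder identity. It records where main points before and after the new commit, so you can check that only feature moved.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo h > file.txt
git add file.txt
git commit -q -m "H"

git branch feature             # feature and main now both point to H
git checkout -q feature        # attach HEAD to feature
main_before=$(git rev-parse main)

echo i >> file.txt
git commit -q -a -m "I"        # new commit I: only the current branch name moves

main_after=$(git rev-parse main)
feature_tip=$(git rev-parse feature)
feature_parent=$(git rev-parse feature~1)

git --no-pager log --graph --oneline --decorate --all
```

The log output should show main still at H and feature (with HEAD attached) at I, whose parent is H.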


² Since Git 2.5, you can use git worktree add to add extra working trees: each one can be on one branch. For various reasons, each added work-tree must be on a different branch from all other work-trees, but this is a very good way to deal with multiple branches when, for instance, you need to fix a high-priority bug without also losing momentum with some feature you're in the middle of.


Commits cannot change but branch names can and do

Note how the only thing we've done to the commits here is to add new ones. That's literally all we can do: we cannot remove existing commits, and we cannot change existing commits. If there is something we don't like about some commit, all we can do is add another commit. That's the nature of commits: that's how Git can work at all. Commits never change; once you give one to some other Git repository, that repository has that commit, under that hash ID that all Git software in the universe agrees is reserved to that commit, and now they have it too.

Branch names, however, can and do move all the time. A branch name, which is just a special kind of name stored in the repository's names database—a branch name is just a string prefixed with refs/heads/, which is what makes it a branch name—stores exactly one hash ID, as do all names in this names database. Whatever hash ID is in that branch name is the last commit on that branch. That's a literal definition in Git.

So, if we have:

...--G--H   <-- main
         \
          I   <-- feature (HEAD)

in our repository, and we force Git to move the name main to point to G instead of H, we get:

...--G   <-- main
      \
       H--I   <-- feature (HEAD)

Now commits up through and including G are on both branches. Commit G is the tip commit of main. Commits H-I are on branch feature. Nothing has changed in the commits themselves. The branch names literally don't matter, except that we use them to find the last commit of each branch.

Should we move the name main forward instead of backwards, we get:

...--G--H--I   <-- feature (HEAD), main

and now all commits are on both branches. The set of branches that contain any given commit changes over time. The commits themselves do not change, but the names by which we can find the commits do change.

If we like, we could force the name feature back one hop, leaving main at H:

...--G--H   <-- feature (HEAD), main
         \
          I   ???

Commit I remains in the repository, but unless you have memorized its hash ID (or can find it somehow), you can't get to it any more. Remember that Git needs the hash ID to find something in its objects database. There's no forwards arrow from H to I, only a backwards one from I to H, and a backwards one from H to G and so on. All Git operations require that we know where we end; Git works backwards, so we start with the end. (The last shall be first, perhaps?)
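The "name moved, commit still there" situation can be reproduced with a sketch like the one below. It assumes a throwaway repository; the commit helper function and the file names are hypothetical, made up just for this demo. It uses git reset --hard to move the current branch name, which is one of several ways to force a name to a different commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit g.txt "G"
commit h.txt "H"
git checkout -q -b feature     # feature starts at H, HEAD attached to it
commit i.txt "I"
lost=$(git rev-parse HEAD)     # remember I's hash ID before moving the name

git reset -q --hard HEAD~1     # force the name feature back one hop, to H

feature_tip=$(git rev-parse feature)
main_tip=$(git rev-parse main)
still_there=$(git cat-file -t "$lost")             # object type, if it exists
branches_with_I=$(git branch --contains "$lost")   # names that can reach I

echo "object $lost is still a: $still_there"
echo "branches containing it: '$branches_with_I'"
```

No branch contains I any more, yet the object is still in the database. In practice, git reflog is one way to find such a hash ID again if you did not save it.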

But we can copy commits to new-and-improved commits

Suppose we have this as before:

...--G--H   <-- main
         \
          I   <-- feature (HEAD)

and we discover something bad about commit I. Maybe a file has a typo. Maybe the commit message has a typo. Maybe both are true. Whatever is the case, we fix anything about any files and git add them so that we can redo the commit, then we run:

git commit --amend

This does not change existing commit I: it literally can't. Instead, it makes a new and improved commit that's a lot like I—maybe it has the same snapshot, if there wasn't anything wrong with the files—with slightly different metadata (maybe we've fixed a typo in the commit message; in any case, the committer time stamp is "now" and time has moved on since we made I). So we get our new commit I', which is a lot like I, and Git shoves I''s hash ID into the name feature. What --amend does is make git commit store, as I''s parent, the parent that I has, i.e., the hash ID of commit H, instead of using I itself as the parent. So now we have this:

          I'  <-- feature (HEAD)
         /
...--G--H   <-- main
         \
          I   ???

Commit I seems to have vanished. If we aren't vigilant about commit hash IDs, commit I' seems to be commit I, as if commit I had changed. But it hasn't: it's still right there, just as it was before. We just can't find it, and that's what we want: commit I is no longer useful; I' is the new and improved I.
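A quick way to convince yourself that --amend makes a new commit rather than changing the old one is to save the old hash ID first. This is a sketch in a throwaway repository; the commit helper, file names, and messages are made up for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit h.txt "H"
git checkout -q -b feature
commit i.txt "I: messsage with a typo"
old=$(git rev-parse HEAD)                    # this is commit I

git commit -q --amend -m "I: message without the typo"
new=$(git rev-parse HEAD)                    # this is commit I'

old_parent=$(git rev-parse "${old}~1")
new_parent=$(git rev-parse "${new}~1")
old_type=$(git cat-file -t "$old")           # I is still in the objects database

echo "I  = $old (parent $old_parent)"
echo "I' = $new (parent $new_parent)"
```

The two hash IDs differ, both commits have H as their parent, and the original I can still be looked up by its hash ID.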

Branches and merges

Suppose we start out with a repository with a few commits, ending in H, and just one branch name main:

...--G--H   <-- main (HEAD)

We now create two new branch names, br1 and br2, both also pointing to commit H:

...--G--H   <-- br1, br2, main (HEAD)

We switch to br1 and create two new commits in the usual way:

          I--J   <-- br1 (HEAD)
         /
...--G--H   <-- br2, main

Then we switch to br2 and make two more commits:

          I--J   <-- br1
         /
...--G--H   <-- main
         \
          K--L   <-- br2 (HEAD)

We're now ready to use git merge to merge commits J and L. Note that we don't merge branches in any real technical sense: we merge commits. The purpose of this merge is to combine work. We want the work we did "on" br1, and the work we did "on" br2. We'll git switch br1 and then git merge br2 for instance.

We'll skip right ahead here to the mechanism, to save space in this answer: The way Git achieves a merge is to diff (as in git diff) the merge base commit against each of the two branch tip commits, so that Git can figure out what work was done. The obvious common starting point for the two branches is commit H. It has a name (main) pointing to it, but that doesn't matter here, and in fact that name is kind of in the way so I'm going to stop drawing it in. So Git runs, in effect:

git diff --find-renames <hash-of-H> <hash-of-J>   # what we changed
git diff --find-renames <hash-of-H> <hash-of-L>   # what they changed

Git then combines the diffs: what we did to some file, we want to do again; what they did to some other file, we want to do again; if we both touched the same file, we want to make both changes. We make these changes to the snapshot taken from commit H, which is the one on the left side of both git diff commands.

This produces a new snapshot, ready to be committed, or perhaps produces some merge conflicts that make Git stop and make us fix them up before the merge can be finished. Assuming all goes well and Git does finish on its own, though, Git now makes a merge commit, which is nothing more than a commit with two parents:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

Note how merge commit M points backwards to both J and L. As usual when making a new commit, Git has stuffed the new commit's hash ID into the current branch name—the one with HEAD attached—so now br1 points to commit M. It's now safe for us to delete the name br2, if we wish, as Git finds commits by walking all paths backwards, so when we select commit M with history, we get all the commits.

This is the essence of a true merge. Git makes the content by combining work and applying the combined work to the merge base. Git finds the merge base by working backwards from the two tip commits (along all paths!—but in our case it was easy as there was just the simple straight-line backwards path) until it finds some shared commit(s). Technically, Git is using the Lowest Common Ancestor algorithm here to find commit H, but for this case it's obvious by eyeball.

Having constructed the content, all on its own if possible, Git now makes the new snapshot and makes the metadata, with the list of two parents showing the then-current commit J first, and the other commit L second. The order here usually doesn't matter, but when it does, --first-parent lets you direct Git to ignore all but the first parent.
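The br1/br2 example above can be replayed with a sketch like this one. It assumes a throwaway repository, and each commit touches its own made-up file so that the merge cannot conflict; the commit helper is hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit h.txt "H"
h=$(git rev-parse HEAD)

git checkout -q -b br1
commit i.txt "I"
commit j.txt "J"
j=$(git rev-parse HEAD)

git checkout -q -b br2 main                  # br2 also starts at H
commit k.txt "K"
commit l.txt "L"
l=$(git rev-parse HEAD)

git checkout -q br1
base=$(git merge-base br1 br2)               # the merge base Git will use
git merge -q -m "M: merge br2 into br1" br2

first_parent=$(git rev-parse HEAD^1)
second_parent=$(git rev-parse HEAD^2)
git --no-pager log --graph --oneline --all
```

The merge base is H, the first parent of M is J (the commit that was current when we ran git merge), and the second parent is L.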

Copying commits via git cherry-pick

Something that happens relatively often while working in Git is that we discover a bug that affects us on our feature branch, but also affects the more-main-ish branch we started from. That is, perhaps we have this:

...--X--o--*--o--P   <-- main
            \
             F--G--H   <-- feature (HEAD)

We discover a bug now and realize that it was made back in commit X, so the bug is present on both main and feature branches.

There are a lot of strategies for fixing it (including some that don't involve cherry-pick), but it's pretty common for people to now fix the bug as an emergency on the main branch. They add a new commit C that fixes the bug:

...--X--o--*--o--P--C   <-- main
            \
             F--G--H   <-- feature (HEAD)

We'd like the fix too, so we want to add a new commit that's a lot like C. We could just duplicate their work and make a new commit I, but wouldn't it be nice if we could literally copy the change from C? And we can:

git cherry-pick main

tells Git to look up commit C (as found by the name main), go back one hop to its parent P, diff the two snapshots to figure out what changed, and apply the same change to our commit H on our branch.

Technically, Git does this using the same merge code as for git merge, except that instead of the obvious merge base (commit *), it forces the merge base to be commit P, the parent of commit C, the commit we're copying. The reason for this is that it produces the right answer (and if you think about it, or work through all the details, you'll see that it does produce the right answer), but for now we'll just think of it as "duplicate that work" and assume it works. This gives us a new commit C', which we call C' because it's so much like C. Git even copies the commit message for us (though we can change it by adding --edit):

...--X--o--*--o--P--C   <-- main
            \
             F--G--H--C'  <-- feature (HEAD)

All we really need to remember here is that cherry-pick effectively copies a commit, to a new-and-improved commit. The improvement in this case is that the new commit goes on our feature branch, and uses commit H's snapshot as modified by whatever changed from P to C.
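Here is a cut-down sketch of that cherry-pick, with a shorter history than the drawing (just X, F, and C). The repository is a throwaway one, and the commit helper and file names are made up. To check that C and C' really make the same change, it compares their git patch-id values, which identify a diff independently of the commit it lives in.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit bug.txt "X: introduces a bug"
git checkout -q -b feature
commit f.txt "F"

git checkout -q main
commit bug.txt "C: fix the bug"              # the emergency fix on main
c=$(git rev-parse HEAD)

git checkout -q feature
git cherry-pick main >/dev/null              # copy C onto feature as C'
c_copy=$(git rev-parse HEAD)

subject_c=$(git log -1 --format=%s "$c")
subject_copy=$(git log -1 --format=%s "$c_copy")
id_c=$(git show "$c" | git patch-id | cut -d' ' -f1)
id_copy=$(git show "$c_copy" | git patch-id | cut -d' ' -f1)

echo "C  = $c  patch-id $id_c"
echo "C' = $c_copy  patch-id $id_copy"
```

C' is a different commit (different hash ID, different parent) that carries the same log message and the same change.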

Simple rebase cases

We're now (finally!) ready to consider a simple git rebase. Suppose we have finished our feature, including the cherry-pick:

...--X--o--*--o--P--C   <-- main
            \
             F--G--H--C'--I--J   <-- feature (HEAD)

We have but one regret, at this point: we wish we'd started our feature at commit C, so that we get all the goodies from the unnamed o after *, and P, and C itself. So we run:

git rebase main

The argument here, main, selects commit C with history, so it means every commit from C on backwards. That doesn't include any of our commits after commit *, because the arrows between commits all point backwards, and commit F, our first part of our feature, is forwards from *.

The commits selected here are temporarily "painted red", if you will: red means stop, do not touch. Now Git starts adding some temporary green "paint" to our own commits, working backwards from the tip of feature. This lists out the hash IDs for J, I, C', H, G, and F. Upon moving back from F to *, Git hits the "red paint" and stops moving back.

Git now cleans off all the temporary paint (this "paint" is really just some bits in some in-core data obtained by reading the commits from the objects database—it's just a way to mentally model how rebase works) and reverses the list of hash IDs, so that they're in the order Git will need when copying: F, G, ..., J.

There's now a special trick that rebase uses: it looks at the commits from C backwards to * too, to see if any of them "do the same thing" as any of the commits in the list above. This usually manages to pick out C and C' as copies of each other. Using this trick, Git discards C' from our own list. The list now goes F-G-H-I-J.

Now, using the name main to select a commit without history, Git switches to that commit. That's commit C, of course. Git uses what Git calls detached HEAD mode for this operation, but if you prefer, you can think of this as using a temporary branch. I'll just draw it with the literal detached HEAD here though:

...--X--o--*--o--P--C   <-- main, HEAD
            \
             F--G--H--C'--I--J   <-- feature

Git now begins using cherry-pick, one commit at a time. First it copies F to a new-and-improved F'. The improvement is that we start with the snapshot from C, and our new parent, when Git makes the new commit by cherry-picking, is C. So our new commit goes on like this:

                      F'  <-- HEAD
                     /
...--X--o--*--o--P--C   <-- main
            \
             F--G--H--C'--I--J   <-- feature

Once F' exists, Git copies G, using git cherry-pick again. This produces a new G':

                      F'-G'  <-- HEAD
                     /
...--X--o--*--o--P--C   <-- main
            \
             F--G--H--C'--I--J   <-- feature

We repeat with every commit except C', which got knocked out of the list:

                      F'-G'-H'-I'-J'  <-- HEAD
                     /
...--X--o--*--o--P--C   <-- main
            \
             F--G--H--C'--I--J   <-- feature

The copying phase is complete, so Git now forces the name feature to move to where we are now—commit J'—and then re-attaches HEAD, so that we're back in the normal working mode:

                      F'-G'-H'-I'-J'  <-- feature (HEAD)
                     /
...--X--o--*--o--P--C   <-- main
            \
             F--G--H--C'--I--J   [abandoned]
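The whole sequence, including the dropped duplicate, can be tried out with a sketch like the following. It uses a shorter history than the drawing (F, G, C', and I on feature), a throwaway repository, and a hypothetical commit helper with made-up file names.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit bug.txt "X: introduces a bug"
git checkout -q -b feature
commit f.txt "F"
commit g.txt "G"

git checkout -q main
commit bug.txt "C: fix the bug"

git checkout -q feature
git cherry-pick main >/dev/null              # C' on feature
commit i.txt "I"

old_tip=$(git rev-parse HEAD)
before=$(git rev-list --count main..feature) # F, G, C', I

git rebase main >/dev/null 2>&1              # copy commits, skipping the duplicate

after=$(git rev-list --count main..feature)  # F', G', I'
new_base=$(git rev-parse feature~3)
main_tip=$(git rev-parse main)
old_type=$(git cat-file -t "$old_tip")       # abandoned commits still exist

echo "commits on feature not on main: $before before, $after after"
```

After the rebase, feature holds one commit fewer because C' was recognized as a copy of C, the copies sit directly on top of main, and the old tip is abandoned but not destroyed.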

Interactive rebase

The interactive mode of git rebase turns each git cherry-pick operation into a separate pick command. This allows you to shuffle the order of the commits around and/or to make some of them read fixup or squash, for instance. On encountering one of the latter, Git effectively uses git commit --amend to make it look like Git combined two commits during the rebase.³
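Normally you edit the instruction sheet by hand, but for a repeatable demo the editing can be scripted. The sketch below sets GIT_SEQUENCE_EDITOR to a sed command that changes the second pick into fixup; that sed command is only a stand-in for what you would type in your editor. The repository, the commit helper, and the file names are made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit a.txt "A"
commit b.txt "B"
commit c.txt "fix for B"

old_tree=$(git rev-parse 'HEAD^{tree}')
before=$(git rev-list --count HEAD)          # A, B, "fix for B"

# Stand-in for hand-editing the TODO file: turn the second "pick" into "fixup".
GIT_SEQUENCE_EDITOR='sed -i.bak -e "2s/^pick/fixup/"' \
  git rebase -i HEAD~2 >/dev/null 2>&1

after=$(git rev-list --count HEAD)           # A, combined B
new_tree=$(git rev-parse 'HEAD^{tree}')
subject=$(git log -1 --format=%s)

echo "$before commits before, $after after; tip subject: $subject"
```

The final snapshot is unchanged, but the history now shows a single commit B that contains both changes. Using squash instead of fixup would additionally open an editor so you could combine the two log messages.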

Besides this particular feature, which lets you rearrange and reorganize commits (and, with a fair bit of extra work, even "split" a commit into two or more separate commits), interactive rebase also offers the new --rebase-merges mode. This does something fairly radical.


³ This means that you get a lot of pointless temporary commits: git fsck will find them and tell you about the dangling commits, and that's perfectly normal. They don't get git push-ed and your own Git will eventually, probably, delete them, after a suitable interval to make sure that you don't mind. I'm not going to cover this for space reasons, but git gc is what cleans up here. The newfangled git maintenance series of commands are meant to subsume git gc eventually. Noteworthy: GitHub do not use git gc: "dangling" leftovers in a GitHub repository remain there forever, for internal implementation reasons, unless you get GitHub support to clean them up for you.

Part 2

torek

Part 2 (link to part 1 here)

Rebasing with merges, with or without --rebase-merges

Suppose that instead of a simple:

...--o--*--P  <-- main
         \
          F--G--H   <-- feature (HEAD)

style branch setup at the start, we have this:

...--o--*--P  <-- main
         \
          \         I--J
           \       /    \
            F--G--H      M--N   <-- feature (HEAD)
                   \    /
                    K--L

That is, we have some branch-and-merge kind of thing happening in our feature branch. We now want to rebase feature onto commit P, as if we started our work there instead of at commit *.

A plain git rebase main will:

  • paint P and * red; and
  • paint N and M and J-and-L and I-and-K and H and G and F green

and those would be our commits-to-be-copied. But regular rebase drops merge commits like M entirely, on purpose. The reason for this is that git cherry-pick literally cannot copy a merge commit like M. This "drop merges" happens in the same way as "drop commits that are copies" (except that it's much easier internally). We end up with:

             F'-G'-H'-I'-J'-K'-L'-N'   <-- feature (HEAD)
            /
...--o--*--P  <-- main
         \
          \         I--J
           \       /    \
            F--G--H      M--N   [abandoned]
                   \    /
                    K--L

or, perhaps, the order of I-J and K-L gets switched around so that we have F'-G'-H'-K'-L'-I'-J'-N'.

The final snapshot in N' is just as good as the final snapshot in N was and the reason for this is that the merge commit M simply combined the work of I-J and K-L. When we used cherry-pick to copy each commit, one at a time, and flatten the merge away, we still got the same changes. The fact that M is not an evil merge means that it was safe to omit it.

This obviously breaks if M is an evil merge. So perhaps evil merges are bad! That's why I said above that it's possible but usually a bad idea to squash an ordinary commit into a merge. The result is an evil merge and it gets dropped by this kind of git rebase.
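To watch a plain rebase flatten a merge away, you can try a sketch like this. It uses a smaller graph than the drawing (one commit per side of the merge), a throwaway repository, and a hypothetical commit helper; every commit touches its own made-up file so nothing conflicts.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "base"                       # plays the role of commit *
git checkout -q -b feature
commit f.txt "F"
git checkout -q -b side                      # side branch inside the feature
commit k.txt "K"
git checkout -q feature
commit i.txt "I"
git merge -q -m "M: merge side" side         # merge commit M
commit n.txt "N"

git checkout -q main
commit p.txt "P"                             # main moves on
git checkout -q feature

merges_before=$(git rev-list --merges --count main..feature)
git rebase main >/dev/null 2>&1              # plain rebase: no --rebase-merges
merges_after=$(git rev-list --merges --count main..feature)
commits_after=$(git rev-list --count main..feature)

git --no-pager log --graph --oneline main..feature
```

The merge commit is gone from the rebased history, which is now a straight line of four copied commits, while the files from both sides of the old merge are still present.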

What about --rebase-merges? Well, in this case, you get a much more complicated TODO worksheet:

I want to understand more about the lines starting with reset and label. Can you give some details or point me to a link to do more reading?

What git rebase needs to do here is re-create merge commit M. The cherry-pick command still can't copy it, so instead of copying it, Git has to re-perform the merge.

The instruction sheet will now say:

  • copy commits F, G, and H: three pick commands
  • save the hash ID of commit H': one label
  • copy commits I and J now, and save the hash ID of commit J', or copy K and L now and save L': one more label
  • switch back to commit H using the first label
  • copy the other two commits that we didn't copy yet
  • some more stuff, but let's pause at this point to draw what we have.

At this point in our copying-commits-to-new-and-improved-commits process, we have this:

                       I'-J'  [labeled and possibly HEAD]
                      /
               F'-G'-H'  [labeled]
              /       \
             /         K'-L'  [labeled and possibly HEAD]
            /
...--o--*--P  <-- main
         \
          \         I--J
           \       /    \
            F--G--H      M--N   <-- feature
                   \    /
                    K--L

We had to label commit H' so that we could reset to it. We had to label at least one of commits J' and L' so that we could find it again—whichever isn't HEAD. It's easier to just label both, and there's a reason to do that anyway, as we're about to see.

We're now ready to make M', our "copy" of M. We can't use cherry-pick at all; we have to run git merge. We'd like it to re-use the commit message from commit M, so we get a merge -C <hash> <label>, where the hash ID is that of the original merge, and the label is the commit we want merged in. Since the order of the two commits matters, we may need to start with a reset to make sure HEAD selects commit J' now, if we're on L'. If we do, we need a label for J' to reset to. We don't know what the hash ID of J' will be at the time we generate the TODO list.

Anyway, we now reset to the appropriate first commit, and git merge the second one, to produce M'. Then we're ready to cherry-pick N' and that's the last operation before the rebase can finish on its own by moving the name feature:

                       I'-J'
                      /    \
               F'-G'-H'     M'-N'  <-- feature (HEAD)
              /       \    /
             /         K'-L'
            /
...--o--*--P  <-- main
         \
          \         I--J
           \       /    \
            F--G--H      M--N   [abandoned]
                   \    /
                    K--L

Note that the git merge that builds commit M' does a new merge. Again, this is safe because (or more precisely, if) commit M is not an evil merge. So --rebase-merges doesn't save us from the error of making an evil merge.
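If you want to look at a label / reset / merge instruction sheet without touching your real repository, a sketch like the one below may help. It sets GIT_SEQUENCE_EDITOR to cat, which simply prints the TODO file and leaves it unchanged, so the rebase then runs as generated. The graph is smaller than the drawing, the repository is a throwaway one, and the commit helper and file names are made up. The exact label names Git picks may differ between Git versions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "base"
git checkout -q -b feature
commit f.txt "F"
git checkout -q -b side
commit k.txt "K"
git checkout -q feature
commit i.txt "I"
git merge -q -m "M: merge side" side
commit n.txt "N"
git checkout -q main
commit p.txt "P"
git checkout -q feature

# "cat" as the editor: print the generated TODO list, then run it unmodified.
todo=$(GIT_SEQUENCE_EDITOR=cat git rebase -i --rebase-merges main 2>/dev/null)
printf '%s\n' "$todo"

merges_after=$(git rev-list --merges --count main..feature)
commits_after=$(git rev-list --count main..feature)
```

The printed sheet should contain label, reset, pick, and merge -C lines, and this time the rebased history keeps one merge commit (the re-performed M').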

Getting back to your first question

We can now finally also answer this:

Why did when I used this command instead git rebase -i --rebase-merges HEAD~3, the TODO file became much smaller and only 3 or 4 items showed up in the file?

The red-and-green-paint trick determines which commits are to be copied. This doesn't depend on the --rebase-merges flag, which—as we saw above—just selects whether Git should re-perform merges, or flatten them away.

It really sounds like you want to flatten away your merges. To do that, don't use --rebase-merges. However, note that the red/green-paint thing can get tricky. In particular, git rebase only lets you pick one commit to select-with-history, to apply "red paint" from there on back.

You will always get the "green paint" applied to whichever commit is HEAD when you run git rebase, plus other commits selected because of the "with history" style selection.⁴ Remember that the commits are the history, and a git merge that you did in the past becomes a branch when we scan backwards, the way Git does.
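I can't see your actual graph, but here is a sketch of one plausible shape that produces the effect you describe: HEAD~3 happens to be a merge commit, so stepping back one more first-parent hop to HEAD~4 suddenly makes the merge and everything it brought in eligible for copying. The repository is a throwaway one, and the commit helper, branch name side, and file names are made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: write one file and commit it with the given message.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit a.txt "A"
git checkout -q -b side
commit s1.txt "S1"
commit s2.txt "S2"
commit s3.txt "S3"
git checkout -q main
git merge -q --no-ff -m "M: merge side" side # M has parents A and S3
commit x.txt "X"
commit y.txt "Y"
commit z.txt "Z"                             # HEAD; HEAD~3 is M, HEAD~4 is A

small=$(git rev-list --count HEAD~3..HEAD)   # X, Y, Z
large=$(git rev-list --count HEAD~4..HEAD)   # X, Y, Z, M, S1, S2, S3

echo "HEAD~3..HEAD selects $small commits"
echo "HEAD~4..HEAD selects $large commits"
git --no-pager log --graph --oneline HEAD~4..HEAD
```

Running git rev-list --count (or git log --graph --oneline) with the same ranges in your own repository should show which commits each of your two rebase commands was going to copy.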


⁴ Caveat: if you run git rebase --onto X Y Z or git rebase Y Z, the operation starts with a git switch Z, and that's the commit that HEAD selects. This is exactly equivalent to running git switch Z first, and then running git rebase --onto X Y or git rebase Y, including the fact that when the rebase completes, you're "on" branch Z.

torek
  • Thanks a lot. How can I give a bounty for this? In the meantime, I have some questions that would help me understand much of what you have written. I am going to post the questions here; please answer them when you can. – tarekahf Aug 24 '22 at 14:24