Why does git merge not create a common ancestor?

Question

This is not a question to solve a problem, just to broaden my understanding of git.

I have a project that is basically a one-man effort, so it has just one branch on our GitLab server; on my development machine, I usually have a "generic" dev branch, and a quickfix branch when needed. I usually delete the quickfix branch after merging it into master, but I keep the dev branch active for longer periods, so there are a number of merges from dev into master as the project progresses.

When I merge dev on master, I usually use --squash to get rid of the irrelevant "development" commits; the merge message proposed to me then says "Squashed commit of the following", with a list of all commits since the common ancestor, that is, since the commit I created dev from, rather than since the previous commit I merged. Of course I simply delete the default message and write my own; yet, I am surprised that git does not realize that a merge from dev to master creates a new common ancestor for the two branches.

Again, not a problem: I can simply delete the dev branch and recreate it–actually, that's what I used to do before realizing that it was not necessary. Yet, I would like to understand why git does not consider a merge as a common ancestor for the branches. Maybe I am missing an option?

EDIT:

My merge strategy does the following:

Is there any way to get the following, without deleting and recreating the dev branch?

I am not sure I understand the question.... but just in case: when you decide to squash, git is actually not saving that revision as _a merge_. That revision will have a single parent and only the comment will say anything about a merge. — eftshift0, May 21 '21 at 16:08
What does this mean? `a merge from dev to master creates a new common ancestor for the two branches`. When you merge two branches, a common ancestor is calculated by git in order to do the merge of the contents (actually, it's trickier when using recursive strategy but alas, let's use kiss for the explanation), _however_ git does _not_ save the revision that it used as the common ancestor in any way in the metadata of the resulting merge revision (or in a squashed merge revision, for that matter). — eftshift0, May 21 '21 at 16:12
Thank you both, I added a couple of diagrams to clarify what I meant. I guess I had not thought enough what I was expecting... — Francesco Marchetti-Stasi, May 21 '21 at 16:55
Squash "merge" is _not a merge_. So it doesn't move the merge base. — matt, May 22 '21 at 03:43
"Is there anyway to get the following, without deleting and recreating the dev branch?" Yes. It's called merge! Not squash merge. Merge. — matt, May 22 '21 at 03:45
@matt, I understand it, now. Anyway, I don't want to see my "meaningless" commits from the local feature branches on the server, so I chose to keep on using squash and delete the feature branches. — Francesco Marchetti-Stasi, May 24 '21 at 09:07

torek · Accepted Answer · 2021-05-24T08:57:10.863

TL;DR

What you want is "features as merge bubbles". To get these, use git merge --no-ff from your mainline with each feature. You should generally put each feature on its own branch, but if you like, you can use the same name (e.g., dev) each time. The branch names don't really matter and Git generally does not store them (you can get them into commit messages if you like, but messages of the form merge branch blergh have no real value).
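A minimal sketch of one such merge bubble, run in a throwaway repository. The file names, commit messages and identity settings are made up for illustration; only the branch names master and dev come from the question.

```shell
# Throwaway repository; file names and messages are illustrative only
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name, whatever init.defaultBranch says
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "initial commit"

# First feature, developed on dev
git checkout -q -b dev
echo one > feature1.txt && git add feature1.txt && git commit -q -m "feature 1, step 1"
echo two >> feature1.txt && git commit -q -am "feature 1, step 2"

# Merge it as a bubble: --no-ff forces a merge commit even though a fast-forward is possible
git checkout -q master
git merge -q --no-ff -m "Add feature 1" dev

# The merge base is now the tip of dev, so the next merge only considers newer commits
merge_base=$(git merge-base master dev)
dev_tip=$(git rev-parse dev)
echo "merge base: $merge_base"
echo "dev tip:    $dev_tip"

# The mainline view shows one commit per merged feature
git log --first-parent --oneline master
```

The individual dev commits are still in the history, but git log --first-parent hides them behind the single "Add feature 1" merge.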

Long

The root of the answer is that git merge --squash does not make a merge (commit).

The word merge in Git is used both as a verb, to merge, meaning to combine two different sets of changes, and as an adjective modifying the word commit: a merge commit is a commit with two or more parents.¹ The adjective form, merge commit, is often shortened to a simple noun, a merge. So we need to keep in mind that some Git commands perform merge-type actions, i.e., do merge-as-a-verb, and some Git commands produce merge-type commits, i.e., make a merge, a noun.

The git merge command often but not always does both. Sometimes it does just one of the two—the merge action, without the merge commit at the end—and sometimes it does neither.

The git cherry-pick and git revert commands always² do the merge-as-a-verb part but never make a merge in the end.

The git commit command can make an ordinary commit, or in some special cases, a merge commit or a root commit: a commit with no parents at all.
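To see the three kinds of commit side by side, one can count parents directly. This is only a sketch in a scratch repository; the helper count_parents and the branch name topic are invented for the example.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "root"
git checkout -q -b topic
echo t > topic.txt && git add topic.txt && git commit -q -m "topic work"
git checkout -q master
git merge -q --no-ff -m "merge topic" topic

# Hypothetical helper: rev-list --parents prints the commit hash followed by
# its parent hashes, so the parent count is the number of words minus one
count_parents() {
    set -- $(git rev-list --parents -n 1 "$1")
    echo $(( $# - 1 ))
}

root_parents=$(count_parents "master~1")   # first parent of the merge is the root commit
ordinary_parents=$(count_parents topic)
merge_parents=$(count_parents master)
echo "root: $root_parents, ordinary: $ordinary_parents, merge: $merge_parents"
```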

To understand how all these parts interact, we need to remember a few more things:

Git actually builds new commits from what is in Git's index.
The index gets expanded during a merge-as-a-verb operation. Now, instead of holding one copy of each file, it holds three.³
If Git stops in the middle of a conflicted merge, it leaves various trace files, such as MERGE_HEAD, MERGE_MSG, CHERRY_PICK_HEAD, and so on. The git status command knows to look for these and can tell you that you are in the middle of a conflicted merge, for instance, with files as yet unresolved, or with all conflicts resolved.
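The expanded index and the trace files can be inspected after a deliberately conflicted merge. The following is a sketch with invented file contents, not a transcript of any real project.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo "original line" > file.txt && git add file.txt && git commit -q -m "initial commit"
git checkout -q -b dev
echo "dev change" > file.txt && git commit -q -am "dev edit"
git checkout -q master
echo "master change" > file.txt && git commit -q -am "master edit"

# Both sides changed the same line, so the merge stops; the failure is expected here
git merge dev > /dev/null 2>&1 || echo "merge stopped on a conflict, as intended"

# Each unmerged path has up to three entries: stage 1 = merge base, 2 = ours, 3 = theirs
git ls-files --unmerged
stage_count=$(( $(git ls-files --unmerged | wc -l) ))

# The trace file that marks the merge as "in progress"
merge_head_present=no
if [ -f "$(git rev-parse --git-path MERGE_HEAD)" ]; then merge_head_present=yes; fi
echo "index entries for file.txt: $stage_count, MERGE_HEAD present: $merge_head_present"
```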

When you run git command --continue or git commit, Git picks up where it left off. (The --continue variety adds a sanity check: it verifies that there really is an unfinished operation of that particular command to continue at this point.) When you run certain kinds of git reset, or git command --abort or git command --quit, Git terminates the unfinished operation and either puts things back (--abort) or doesn't (--quit) by invoking the right kind of reset (--hard or --soft).

This means that, e.g., git merge --no-commit can start the merge, run it as far as it can on its own—perhaps even to the point that there are no conflicts remaining—and then just stop and let you fiddle with Git's index and/or your working tree as much as you like. Your eventual git merge --continue or git commit will then finish the merge, using the files Git left behind when it stopped, plus any updates you made to the index (a so-called evil merge; see Evil merges in git?). Or, your git reset --hard or git merge --abort erases all the work that git merge did, removes the merge-in-progress marker files, and leaves you set up as if you had not even started a git merge command.⁴
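Here is a small sketch of both outcomes: first a merge that is started and then aborted, then the same merge started again and finished with git commit. The repository and its files are invented for the example, and the two branches touch different files so that no conflict occurs.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "initial commit"
git checkout -q -b dev
echo dev > dev.txt && git add dev.txt && git commit -q -m "dev work"
git checkout -q master
echo main > main.txt && git add main.txt && git commit -q -m "mainline work"
before=$(git rev-parse HEAD)

# Round 1: start the merge, stop before committing, then throw it away
git merge -q --no-commit dev
git status --short            # dev.txt is staged; MERGE_HEAD marks the merge in progress
git merge --abort
after_abort=$(git rev-parse HEAD)

# Round 2: same start, but finish it; git commit finds MERGE_HEAD and records two parents
git merge -q --no-commit dev
git commit -q -m "Merge dev into master"
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of the final commit: $parents"
```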

Anyway, if you have gotten through to this part without getting lost, git merge --squash becomes very easy to understand. It:

starts the merge process, like git merge would;
has an implied --no-commit, so that it stops before committing; and/but
does not create any "merge going on" files, so that after it stops, git merge --continue is not allowed, and git commit will make an ordinary commit, not a merge commit.

Since merges, and future merge bases, are determined by the commit graph—which is to say, the commits themselves including their parent linkages—and git merge --squash does not put in the extra parent linkage, the final commit doesn't have the history you want. The solution, then, is to avoid git merge --squash.
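The effect on the merge base can be checked in a scratch repository. In this sketch (file names and messages are invented), the squash commit ends up with a single parent, and the merge base of master and dev stays at the commit dev was created from.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "initial commit"
fork_point=$(git rev-parse HEAD)          # the commit dev is created from

git checkout -q -b dev
echo one > work.txt && git add work.txt && git commit -q -m "dev step 1"
echo two >> work.txt && git commit -q -am "dev step 2"

git checkout -q master
git merge -q --squash dev                 # stages the combined changes, then stops

# No merge-in-progress marker was left behind
marker=absent
if [ -f "$(git rev-parse --git-path MERGE_HEAD)" ]; then marker=present; fi

git commit -q -m "Add dev work as one commit"
squash_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
base_after_squash=$(git merge-base master dev)

echo "MERGE_HEAD: $marker, parents of squash commit: $squash_parents"
echo "merge base still at fork point: $base_after_squash"
```

A later git merge --squash dev would therefore start again from the fork point, which is why the proposed message lists every dev commit each time.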

You might (quite legitimately) wonder what git merge --squash is good for. The answer is: not all that much! There's one situation in which it definitely makes sense, though, and that is when you:

create a branch for experimentation;
do your experimenting by writing multiple commits;
at the end of the experimenting, decide that the result is good, but it should just be one commit; and
want to easily make that one commit.

To make that one commit, you go back to the branch from which you created the experimental branch, and run git merge --squash experiment (or whatever name is appropriate here). You then write the desired commit message for the one commit, and then delete the experimental branch. It is now "dead": its commits have no further use. They are all trash, to be hauled away with the rest of the rubbish in a month or so when the garbage collector gets around to it.
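As a sketch of that workflow, with an invented branch name experiment and made-up file contents:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "initial commit"

# Experiment in several small commits
git checkout -q -b experiment
for attempt in 1 2 3; do
    echo "attempt $attempt" >> idea.txt
    git add idea.txt
    git commit -q -m "experiment, attempt $attempt"
done

# Keep the result as a single commit on the original branch
git checkout -q master
git merge -q --squash experiment
git commit -q -m "Add the result of the experiment"

# Plain -d would refuse here, because Git sees no merge of these commits; -D forces it
git branch -q -D experiment

git log --oneline master
commit_count=$(git rev-list --count master)
echo "commits on master: $commit_count"
```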

If you don't intend to kill the branch, git merge --squash is probably the wrong tool. (But see also matt's comment about using squash-merge with GitHub PRs.)

¹A commit with more than two parents is an octopus merge. These are normally made with git merge -s octopus, but the -s octopus part is implied by giving git merge two or more commit specifiers. They don't do anything you can't do with more typical two-parent merges. In fact, they specifically don't do things—namely, resolve conflicts—that you can do with two-parent merges, which is probably the main justification for having octopus merge in the first place: since an octopus merge is "weaker" than a normal merge, if you see one in a set of commits, you can be pretty sure it was one of these easy, conflict-free merge cases.

Overall, though, I still think octopus merges are mainly just for showing off.
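For the curious, a sketch of a conflict-free octopus merge. The branch names topic-a and topic-b are invented, and master gets a commit of its own so that it is recorded as a parent alongside the two topic branches.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "initial commit"

git checkout -q -b topic-a master
echo a > a.txt && git add a.txt && git commit -q -m "topic a"
git checkout -q -b topic-b master
echo b > b.txt && git add b.txt && git commit -q -m "topic b"

git checkout -q master
echo m > m.txt && git add m.txt && git commit -q -m "mainline work"

# Naming two commits selects the octopus strategy implicitly
git merge -q -m "Merge topic-a and topic-b" topic-a topic-b

octopus_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of the octopus merge: $octopus_parents"
```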

²"Always" here is a little too strong: sometimes git cherry-pick can just error out, for instance, and if the merge-as-a-verb part of the action stops with a merge conflict, you're left in the middle of the operation.

³More precisely, it holds up to three, from the three input commits to a merge operation: the merge base, the "ours" or "local" or HEAD commit, and the "theirs" or "remote" or "other" commit. But if a file is missing from one of the three commits—for instance, if we modified file path/to/file.ext and they removed it entirely—there might be fewer than three index entries for the file.
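The modify/delete case can be reproduced like this. It is a sketch with invented contents: "we" (master) modify file.txt while "they" (dev) delete it, so there is no stage 3 entry.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo "original" > file.txt && git add file.txt && git commit -q -m "initial commit"
git checkout -q -b dev
git rm -q file.txt && git commit -q -m "remove file.txt"
git checkout -q master
echo "modified" > file.txt && git commit -q -am "modify file.txt"

# Modify/delete conflict: the merge stops, which is expected here
git merge dev > /dev/null 2>&1 || echo "merge stopped on a modify/delete conflict"

# Only the merge base (stage 1) and our version (stage 2) are in the index
git ls-files --unmerged
entry_count=$(( $(git ls-files --unmerged | wc -l) ))
stages=$(git ls-files --unmerged | awk '{printf "%s", $3}')
echo "index entries: $entry_count, stages present: $stages"
```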

⁴Note that for this to work, the state that git reset --hard writes—which is to say, the set of files that are in the HEAD commit right now—must match the state that everything had when you first started the git merge. Equivalently, git status would have had to have said nothing to commit, working tree clean (though perhaps with untracked files). That's why git merge normally requires a "clean" state before it is willing to start. The internal git merge-recursive command is not so careful, and it's possible to start a merge with index and/or working tree in states that cannot be recovered by stopping the merge after all, if you run git merge-recursive—as, e.g., git stash apply does.

"what git merge --squash is good for. The answer is: not all that much!" Yes and no. I felt this way until I found myself on a team that uses only squash merges with automatic feature branch deletion, via GitHub. The loss of history, I discovered, was mitigated by the fact that pull requests live forever. So the history wasn't really lost, and the closed pull requests became a valuable resource, a veritable encyclopedia. — matt, May 22 '21 at 22:13
@matt: Interesting; that does seem a useful thing. The PRs keep the "real" history if it's needed, but normally, it stays helpfully out of the way. Merge bubbles achieve the same effect but do require `git log --first-parent` (which has its own issues)... — torek, May 22 '21 at 23:34
Yessss, many thanks for your time, this is *exactly* what I wanted–a thorough explanation of what was going on! I think I understood most of what you said, and the remainder... I'll come back to this explanation :) — Francesco Marchetti-Stasi, May 24 '21 at 08:54
As for my question: I now understand that keeping alive the dev branch after merging, committing and pushing it is confusing, I will delete and recreate it (and maybe give it more meaningful names each time). — Francesco Marchetti-Stasi, May 24 '21 at 09:01

score 1 · Answer 2 · answered May 21 '21 at 17:18

Well, the final question (at least, current version of the question) can be achieved like this (assuming some-branch is on A and other-branch is on revision-6):

git checkout some-branch
git merge revision-3 # by using its ID
# now we have created revision B
git checkout other-branch
# let's rebase it
git rebase some-branch # this should set up revision-4, 5 and 6 on top of B
git checkout some-branch
git merge other-branch

And there you have it.

eftshift0


Thanks for the idea, in the end after understanding what was going on (that was my main need, after all) I decided to give up long-lived branches. – Francesco Marchetti-Stasi May 24 '21 at 09:04
Nothing wrong with long-lived branches, as long as you understand what's going on. Certainly, shorter _straight-line_ feature branches are much simpler to juggle with. – eftshift0 May 24 '21 at 12:43
Well, they were not especially useful for my purpose, so I could give them up. Also, while I daresay they are essential on the server, I convinced myself that they are not very useful on a development machine. – Francesco Marchetti-Stasi May 24 '21 at 15:03
I am tempted to ask why you differentiate between server/development box (it sounds like you will be using different branches between both environments?) **but** if you are clear on what you want to do and how it can be achieved with git, then _by all means_. – eftshift0 May 24 '21 at 16:13
Well, I asked the question to learn more about git and how to use it, so any side discussion is welcome :) I differentiate between development and server because on my development machine I always work in topic branches that are not necessary on the git server; when I think I'm ready, I merge the branch on master and then push it on the server. Even the commits I made on the topic branch have really no use on the server (some of them are things like "placeholder before a potentially disruptive change"...) I thought this was a common way to use git, isn't it?... – Francesco Marchetti-Stasi May 25 '21 at 08:50
Yeah, there has to be some misunderstanding. And sure, you are more than welcome to ask questions. I think all the basics have been covered in past questions, though. :-) When you push a branch to the server, you are not pushing just the tip of the branch... you push all revisions and all the objects they point to. That is, _if you are using git_.... unless, of course, you are using some trick to hack the history of the branch you are pushing into a remote (and there are more than a couple of tricks). – eftshift0 May 25 '21 at 12:53
I'm afraid I wasn't clear enough: I push only master on the server, my dev branch is local to my development machine(s). – Francesco Marchetti-Stasi May 26 '21 at 14:28

matt · Answer 3 · 2021-05-22T04:26:24.427

When I merge dev on master, I usually use --squash to get rid of the irrelevant "development" commits

You are not accomplishing your stated purpose. You are not getting rid of any commits.

Moreover, a squash merge is not a merge at all, and it does not make any connection between this branch and the other branch. Thus the merge base never moves and the history is lost. That's why, as I say here, https://stackoverflow.com/a/67609758/341994, squash merges and long lived branches are opposites.

What you are describing is that you would squash the development commits first to simplify the history, and then merge with a true merge. In other words, stop using merge --squash and instead use squash (using reset or interactive rebase) and then merge.
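One possible way to do that with reset, sketched in a scratch repository. It assumes dev is purely local and unpublished, since reset --soft rewrites its history; the file names and messages are invented.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the example
git config user.name "Example" && git config user.email "example@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "initial commit"

# Messy development commits on dev
git checkout -q -b dev
for step in 1 2 3; do
    echo "step $step" >> feature.txt
    git add feature.txt
    git commit -q -m "wip $step"
done

# Squash on dev itself: move dev back to the merge base, keeping index and working tree
git reset -q --soft "$(git merge-base master dev)"
git commit -q -m "Add feature X"

# Then make a true merge, which records dev as a second parent
git checkout -q master
git merge -q --no-ff -m "Merge feature X" dev

new_base=$(git merge-base master dev)
dev_tip=$(git rev-parse dev)
total_commits=$(git rev-list --count master)
echo "merge base moved to dev tip: $([ "$new_base" = "$dev_tip" ] && echo yes || echo no)"
echo "commits reachable from master: $total_commits"
```

The wip commits never reach master, and because the merge is a real one, the next round of work on dev starts from the new merge base.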

Yeah, thanks to @torek's answer now I understand it. I chose to keep squashed merges, so I can commit as much as I need on my feature branches (that's what my "dev" branch is, after all), and remove them after merging. — Francesco Marchetti-Stasi, May 24 '21 at 09:03
