How to do a branch merge directory by directory in multiple steps

Question

Our Git source repository has multiple modules. Each module resides in a different directory. Each module has a specialized developer.
e.g.

src
|--module A --
|--module B --
|--module C --

Now we have 2 branches. Say the working branch is "develop" and the other branch is "other" (a less active branch). We want to merge "other" into "develop". Is it possible to do this directory by directory?

e.g. I want to merge only "module B" now, then make new changes on the merged code and commit them to "develop". Then after some time (say 2-3 months) different developer(s) need to merge modules A and C from "other" to "develop".

i.e. simply, is it possible to split a Git merge into multiple steps by subdirectory?

Please note that the "other" branch has a large number of commits. So identifying all the commits that changed "module B" and cherry-picking them one by one is not feasible.

The answer I can come up with so far is:

1. Start the other -> develop merge.
2. Resolve merge conflicts ONLY in "module B". Back up the "module B" directory to a different location.
3. Discard the merge started in #1.
4. Create a new branch from develop -> delete the "module B" directory -> copy the "module B" backed up in #2 to this branch -> commit -> merge to the develop branch -> carry on with my new changes in "module B".

The downside is:
When the "other -> develop" merge is done in the future, there will still be merge conflicts for "module B". Those conflicts will have to be ignored (i.e. the "use local file" option has to be used in git gui). Then any newer changes made in the "other" branch for "module B" have to be cherry-picked or copied manually.

Does this answer your question? [How can I selectively merge or pick changes from another branch in Git?](https://stackoverflow.com/questions/449541/how-can-i-selectively-merge-or-pick-changes-from-another-branch-in-git) — mkrieger1, Mar 29 '21 at 15:05
Its accepted answer is to use cherry-pick, but this is not feasible as there are too many commits in the "other" branch for "module B". Also, the rest of the answers seem to explain how to merge "module B" from "other" to "develop", BUT after that no more merges are possible (i.e. the changes to modules A and C can NEVER be merged) — aKumara, Mar 29 '21 at 15:52

score 2 · Answer 1 · answered Mar 30 '21 at 00:24

The short answer is no. But that answer is (a) unsatisfying and (b) leaves out what you can do to approach your final goal.

I think this is best illustrated by making some drawings. Let's make a simplified picture of your problem, showing things the way Git sees them, commit-by-commit. We begin with some common commit at the end of some sequence of commits:

...--F--G--H   <-- main

We now make some branches (and stop using the name main entirely for now) and on these side branches we make some commits:

          I--...--J   <-- branch1
         /
...--G--H
         \
          K--...--L   <-- branch2

There are probably many commits on the two non-main branches here, even though we only show two.

When you merge these, the way Git achieves the merge result is this:

Git extracts H's snapshot, and then J's snapshot, and compares the two to see what changed on branch1.
Git extracts H's snapshot, and then L's snapshot, and compares the two to see what changed on branch2.
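The two comparisons can be reproduced by hand with `git merge-base` and `git diff`. The following is a minimal sketch in a throwaway repository; the directory names `moduleA` and `moduleB` and the file names are made up for illustration, and `main` stands in for commit H.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

# Commit H: the shared starting point.
mkdir moduleA moduleB
echo base > moduleA/a.txt
echo base > moduleB/b.txt
git add . && git commit -q -m "H"
git branch branch1 && git branch branch2

# Commit J on branch1 and commit L on branch2.
git checkout -q branch1
echo "branch1 change" >> moduleA/a.txt
git commit -q -am "J"
git checkout -q branch2
echo "branch2 change" >> moduleB/b.txt
git commit -q -am "L"

# The merge base is H; these are the two diffs a merge would combine.
base=$(git merge-base branch1 branch2)
git diff --stat "$base" branch1   # what changed on branch1
git diff --stat "$base" branch2   # what changed on branch2
```

Note that both diffs cover the whole tree, not one directory: this is the "all of them" that Git insists on handling.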

In your case, this is ... a whole lot of changes. :-) It's so many that you don't want to handle them all. Git, however, will insist on handling all of them. If you get conflicts, git merge will stop in the middle of the job and exit, forcing you to fix all the conflicts. However you go about doing that, once you're done, Git will believe that the resulting files are the correct merge result.

To finish the merge after resolving everything, you run git merge --continue or git commit. This makes a new merge commit M:

          I--...--J
         /         \
...--G--H           M
         \         /
          K--...--L

Commit M, like every commit, has a full snapshot of all files. This is by definition the correct result of this merge. That's why the short answer is no.

Now for the trick

But—here is why the simple "no" answer is incomplete—you will note that I left off all the branch names in the above. Suppose, before beginning the merge, we create one new branch name, and make that our current branch name, like this:

          I--...--J   <-- branch1, merge-A (HEAD)
         /
...--G--H
         \
          K--...--L   <-- branch2

When we finish the merge, we will have this:

          I--...--J   <-- branch1
         /         \
...--G--H           M   <-- merge-A (HEAD)
         \         /
          K--...--L   <-- branch2

Note that no existing commit has changed, and only the name merge-A has changed, to point to new merge commit M.
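To see this for yourself, here is a small sketch in a scratch repository (the module directories and file contents are hypothetical, and the two branches are kept conflict-free so the merge completes on its own). It checks that `branch1` and `branch2` still name J and L after the merge, and that J and L are the two parents of M.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

mkdir moduleA moduleB
echo base > moduleA/a.txt
echo base > moduleB/b.txt
git add . && git commit -q -m "H"
git branch branch1 && git branch branch2

git checkout -q branch1
echo one >> moduleA/a.txt && git commit -q -am "J"
git checkout -q branch2
echo two >> moduleB/b.txt && git commit -q -am "L"
J=$(git rev-parse branch1)
L=$(git rev-parse branch2)

# Make the new name first, then merge while on it.
git checkout -q -b merge-A branch1
git merge -q --no-edit -m "M: merge branch2" branch2

git rev-parse branch1 branch2        # still J and L
git rev-parse merge-A^1 merge-A^2    # the parents of M: J and L again
```

The `merge-A^1` and `merge-A^2` spellings are also how you can recover J and L later, should the branch names move on.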

What this means is that you can now use all three branches to continue working, to whatever extent you like. Then, on the next day (or week or month), you can go back to branch1 and make another new name, such as merge-B, and make that the current branch:

          I--...--J   <-- branch1, merge-B (HEAD)
         /
...--G--H
         \
          K--...--L   <-- branch2

(This assumes branch1 and branch2 have not moved. If they have moved, it's probably wise to find the hash IDs of commits J and L, and use them here. Note that they are find-able as the two parents of merge commit M, which still exists—I have just chosen to stop drawing it for the moment.)

We now run the same kind of git merge, to make a commit MB or M2, in which we really only resolve the module-B files. As far as Git is concerned, this is the correct result of this merge, i.e., the module-A and module-C files are all correctly merged (regardless of what we stuck in them for the MB snapshot). We must remember that modules A and C are mis-merged, just as when we made M, we had to remember that modules B and C were mis-merged. But in any case we get this result:

          I--...--J   <-- branch1
         /         \
...--G--H           MB   <-- merge-B (HEAD)
         \         /
          K--...--L   <-- branch2

MB is a new commit, separate from M (or MA if we want to start calling it that).
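One possible way to make such a module-B-only merge is sketched below. It assumes a scratch repository with made-up `moduleA` and `moduleB` directories in which both branches conflict in both modules. Using `git checkout HEAD -- moduleA` to put branch1's files back is just one choice for the "whatever we stuck in them" part, not the only one.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

mkdir moduleA moduleB
echo base > moduleA/a.txt
echo base > moduleB/b.txt
git add . && git commit -q -m "H"
git branch branch1 && git branch branch2

# J and L both touch both modules, so both modules conflict.
git checkout -q branch1
echo one > moduleA/a.txt; echo one > moduleB/b.txt
git commit -q -am "J"
git checkout -q branch2
echo two > moduleA/a.txt; echo two > moduleB/b.txt
git commit -q -am "L"
J=$(git rev-parse branch1)
L=$(git rev-parse branch2)

# Start merge-B at J (by hash, in case branch1 has moved) and merge L.
git checkout -q -b merge-B "$J"
git merge --no-commit "$L" >/dev/null 2>&1 || true   # stops on the conflicts

# Module B: resolve for real (here, keep both lines).
printf 'one\ntwo\n' > moduleB/b.txt
git add moduleB
# Module A: deliberately NOT merged, just put back branch1's version.
git checkout -q HEAD -- moduleA

git commit -q -m "MB: ONLY moduleB is merged; moduleA still needs merging"
```

The explicit commit message matters: nothing else in the repository records that module A in this snapshot is a placeholder.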

Later, we can repeat this to make an MC merge.

If we try to draw them all in, it gets pretty messy:

...--J____
     |\   \
     | \   MA
     |  \ /
     |   X
      \ / \
       X   MB
      / \ /
     |   X
     |  / \
     | /   MC
     |/___/
...--K

(this would go better with graph-drawing tools, obviously).

You are now free to extract particular files from each of the three MA, MB, and MC commits, and combine them to produce some new merge commit. Exactly how you go about doing that is up to you. Exactly what you make in terms of new commits to record this is also up to you. I myself might go for a new merge commit with just parents J and L: that is, a fourth merge, made in the same way that each of these three merges was made, but with its snapshot built by extracting the three merge results.
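As a rough sketch of that fourth merge, cut down to two modules to keep it short: the `partial_merge` helper, the directory names and the file contents are all invented for this example, and `git checkout <commit> -- <dir>` is used to pull a whole directory out of an earlier merge commit (`git restore --source=<commit>` would do the same job in Git 2.23 or later).

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

mkdir moduleA moduleB
echo base > moduleA/a.txt; echo base > moduleB/b.txt
git add . && git commit -q -m "H"
git branch branch1 && git branch branch2
git checkout -q branch1
echo one > moduleA/a.txt; echo one > moduleB/b.txt; git commit -q -am "J"
git checkout -q branch2
echo two > moduleA/a.txt; echo two > moduleB/b.txt; git commit -q -am "L"

# partial_merge NAME FILE (hypothetical helper): merge branch2 into a new
# branch NAME started at branch1, really resolving only FILE.
partial_merge() {
    git checkout -q -b "$1" branch1
    git merge --no-commit branch2 >/dev/null 2>&1 || true  # stops on conflicts
    git checkout -q HEAD -- .        # take branch1's files everywhere...
    printf 'one\ntwo\n' > "$2"       # ...then hand-resolve this one file
    git add "$2"
    git commit -q -m "$1: only $2 is really merged"
}
partial_merge merge-A moduleA/a.txt   # MA
partial_merge merge-B moduleB/b.txt   # MB

# The final merge: parents J and L, snapshot assembled from MA and MB.
git checkout -q -b merged branch1
git merge --no-commit branch2 >/dev/null 2>&1 || true
git checkout -q merge-A -- moduleA
git checkout -q merge-B -- moduleB
git commit -q -m "Merge branch2: moduleA from merge-A, moduleB from merge-B"
```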

Note that this is not "free"

If development continues after making the three separate MA, MB, and MC merges, you're still stuck with merging in the post-MA, post-MB, and post-MC commits. This requires running git merge again and getting still more merge conflicts. It will probably be very important to record very explicit merge messages in MA, MB, and MC (and perhaps even in some subsequent commits) to remind yourself and/or others that there is still work to be done in the future. Then, when you make the "correct" merge that merges all three modules, you'll need to do the extra merge work, perhaps recording the branch-tip commits from the post-MA, post-MB, and post-MC work as the second parents of each of these various merges.

The final graph will be very tangly and confusing. That's somewhat unavoidable here. Using octopus merges, one can perhaps afford each of the many legs of the octopodes equal "status" or "priority" in terms of the final result, but that is also a confusing way to work. Whether Git's built-in -s octopus strategy—which is how Git normally makes an octopus merge—can handle this is questionable; if it can't, you can use git commit-tree to make these merge commits manually, or just use regular two-at-a-time merges to build the final result.
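For the manual route, a minimal sketch of building a three-parent commit with `git write-tree` and `git commit-tree` might look like this. To keep it short, the branches `merge-A`, `merge-B` and `merge-C` here are ordinary single-parent commits standing in for MA, MB and MC, and every file and directory name is made up.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

mkdir moduleA moduleB moduleC
for m in A B C; do echo base > "module$m/file.txt"; done
git add . && git commit -q -m "base"

# Stand-ins for MA, MB, MC: each branch holds the good version of one module.
for m in A B C; do
    git checkout -q -b "merge-$m" main
    echo "merged $m" > "module$m/file.txt"
    git commit -q -am "M$m: module$m merged"
done

# Assemble the combined snapshot in the index, one directory per source.
git checkout -q -b combined main
for m in A B C; do git checkout -q "merge-$m" -- "module$m"; done
tree=$(git write-tree)

# Create the multi-parent commit by hand, then point the branch at it.
mm=$(git commit-tree -p merge-A -p merge-B -p merge-C -m "MM: all modules" "$tree")
git reset -q --hard "$mm"
git --no-pager log --graph --oneline -n 4
```

Because `git commit-tree` takes the tree and the parents as given, no merge strategy runs at all: the snapshot is whatever you put in the index.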

The benefit, if any, of such an octopus merge is not very large. Here's a hand ASCII drawing of what one might look like, building on the three-M merges:

...--J____
     |\   \
     | \   MA
     |  \ /  \
     |   X    \
      \ / \    \
       X   MB--MM   <-- main
      / \ /    /
     |   X    /
     |  / \  /
     | /   MC
     |/___/
...--K

but note here that no work has happened since each of the three separate M merges. Having made MM—the master merge of all modules—it's now time to tackle that work.

+1 since this adds something useful, BUT for simplicity, let's say only MB(ONLY module B conflicts resolved) exists. Then I copy all the files in Module-B dir to branch-1(let's say this is the Active development branch) and make a commit in branch-1. After some time branch 2 is attempted to be merged to branch-1. (say no changes done on branch-2 from MB creation point) does GIT still show merge conflicts for module-B in this latest merge attempt ? is this the approach you are suggesting here ? (except for "Exactly how you go about doing that is up to you" part) — aKumara, Mar 30 '21 at 05:54
It's important to remember that *branch names* just *find commits*. What Git cares about are the commits and the commit graph. A `git merge` follows the linking connections backwards from commits, to find the best *shared* commit on both branches. So if you make new commits while on `branch-1` or `branch-2`, this creates a new commit that links back to some existing commit. (Draw this out!) The *name* now points to the *new* commit. A future `git merge` is going to use the arrows from commits to find a merge base. The files *in* that commit act as the input to the future `git merge`. — torek, Mar 30 '21 at 06:01
Any merge conflicts you get will depend on the answers to these questions: (1) which commit was the merge base? (2) which were the two tip commits? (3) what was the *diff* from merge base to each of the two commits you chose to merge? — torek, Mar 30 '21 at 06:02
When you go to make a new commit—whether it's a merge commit or an ordinary single-parent commit—the *files that will be in it* are up to you. Typically you arrange the contents by updating your working tree files (however you like), and using `git add`. That's what ends up in the commit: the contents of the files in Git's index, as updated by `git add`. — torek, Mar 30 '21 at 06:03
If you want to get some file(s) out of some commit(s), the `git restore` command (Git 2.23 or later) or `git checkout` command (pre-Git-2.23) will let you do this. You can get an entire directory (folder) worth of files at once, if you like, from any existing commit. This is where the "up to you" part comes from: if merge commit MB has correct module-B files, you can get them out of commit MB at any time in the future, using `git checkout` or `git restore`. — torek, Mar 30 '21 at 06:06
to re-state the question with answers to your questions, (1) merge_base of MB=branch_1, (2) tip commits - J and L, (3) let's say there were diffs in large no of files across all modules. ONLY module-B conflicts resolved . Now, (1)Is there a difference between manually copying/git restore/git checkout directory/files from MB to branch 1 ? (2) Will any of them hinder a later branch 2 to branch 1 merge ? (3) will GIT show merge conflicts for module-B in latest merge attempt (or will GIT consider them as already merged) ? — aKumara, Mar 30 '21 at 07:13
Q1: No: the contents of a commit are just the contents, regardless of how you got them there. (This is true for both snapshot and metadata, but metadata is provided by Git itself, so we're mostly interested in snapshot/files.) Q2: if it makes a messier merge conflict, you can call that a hindrance. If it's only in files you intend to take wholesale, I wouldn't (call it a hindrance). Q3: you say "merge_base of MB=branch_1" but a merge base is from *two* commits, so without sitting down and drawing the graph, I'm not sure what merge conflicts we'll see. — torek, Mar 30 '21 at 07:22
Sorry, what I meant by "hindrance" was: will getting ONLY the module-B merged files and merging them to branch-1 mark ALL modules as merged (by Git), so that a future br-2 to br-1 merge will NOT bring ANY changes of modules A and C to branch-1? For "Q3 you say "merge_base of...", I was referring to the diagram you drew in your answer. Merge base is commit "J" — aKumara, Mar 30 '21 at 07:46
to re-state the question, referring to your diagram, (say branch-2 does NOT grow beyond L), I am looking for a solution "All changes in branch-2 for ONLY module-B merged to branch-1. After this more changes done to all modules in branch-1". Then later, branch-2 need to be merged to branch-1 (this time ALL modules need to be merged). Now conflicts should be shown for "modules A and C(i.e. they should NOT disappear due to previous module-B ONLY merge), but NOT for module-B" — aKumara, Mar 30 '21 at 07:57
It's not a question of "mark merged", it's a question of finding the merge base the next time around. I can't draw this properly in a comment, but: suppose branch X ends with commits ...-(J+L)-M-N and branch Y ends with commits ...-L-O-P, where M is a merge commit merging J and L as above. If we now ask Git to merge commits N and P, Git will find the *merge base* by working backwards from N (N, M, J-and-L) and also from P (P, O, L). This makes commit L the "merge base". The merge now diffs the snapshot in L vs that in N and vs that in P. These are the changes that must be combined. — torek, Mar 30 '21 at 09:21
The conflicts, if there are any, arise from combining L-vs-N and L-vs-P. What *files* are in L? (These are from before merging the three modules, in your example.) What *files* are in N? (These depend on what you did after M.) What *files* are in P? (These depend on what you did after L.) — torek, Mar 30 '21 at 09:22
If branch-2 still ends at commit L, and you do another merge from some other commit using branch-2 to select commit L, well, now you must find the merge base of the other commit and commit L. Work backwards from the other commit, and work backwards from L. Which commit did you find? What snapshots are in the three commits that you found? — torek, Mar 30 '21 at 09:23
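The X/Y example from the comments above can be tried out in a scratch repository. This is only a sketch under assumed names: branches `X` and `Y`, commits labelled as in that comment, and made-up `moduleA`/`moduleB` files. Merge M deliberately keeps X's version of module A, as a module-B-only merge would.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

mkdir moduleA moduleB
echo base > moduleA/a.txt; echo base > moduleB/b.txt
git add . && git commit -q -m "H"
git branch X && git branch Y

# L on branch Y changes both modules.
git checkout -q Y
echo "from Y" > moduleA/a.txt; echo "from Y" > moduleB/b.txt
git commit -q -am "L"
L=$(git rev-parse HEAD)

# J on branch X, then merge M that keeps ONLY the moduleB result.
git checkout -q X
echo "X work" > moduleA/x.txt; git add . && git commit -q -m "J"
git merge --no-commit Y >/dev/null 2>&1 || true
git checkout -q HEAD -- moduleA            # moduleA put back to J's version
git commit -q -m "M: only moduleB merged"
echo n > moduleB/n.txt; git add . && git commit -q -m "N"

# O and P on branch Y.
git checkout -q Y
echo o > moduleB/o.txt; git add . && git commit -q -m "O"
echo p > moduleB/p.txt; git add . && git commit -q -m "P"

base=$(git merge-base X Y)                 # this is L, not H
git checkout -q X
git merge -q --no-edit -m "second merge of Y" Y
cat moduleA/a.txt                          # L's moduleA change is not re-applied
```

In this setup the second merge only combines L-vs-N with L-vs-P, so the module A change that Y made in L does not come back and no conflict is raised for it. That is why a partial merge recorded as a real merge commit on the active branch has to be clearly labelled, or kept on a separate branch name as in the answer.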
