What's the best practice in Git for keeping two branches that differ by just one commit while all other commits are the same?

Question

I'm working on a project in which I want to use different branches with different database connection parameters. It currently looks like this:

             K--L--M--N--O   <-- master (using MySQL)
            /
...--F--G--H
            \
             I   <-- PostgreSQL

Commit K sets the connection parameters to MySQL, while commit I replaces the database connection parameters with those for PostgreSQL.

Now I want the PostgreSQL branch to have all of the L, M, N, and O commits. Just one commit should differ; all the others should be the same. Something like this:

             K---L--M--N--O   <-- master (using MySQL with the K commit) 
            /   /             <-- PostgreSQL (with the I commit)
...--F--G--H   /
            \ /
             I

However, after some research, I found that this might be impossible. It is said that a branch is just a pointer to commit O's hash, not a record of history. Is that true?

So I searched even more and found some answers here: https://stackoverflow.com/a/4024138/12007788
https://stackoverflow.com/a/18529576/12007788

These answers suggested cherry-pick. So I used

git checkout PostgreSQL
git cherry-pick L^..O

Now, it looks like this.

             K--L--M--N--O   <-- master (using MySQL)
            /
...--F--G--H
            \
             I--L'--M'--N'--O'    <-- PostgreSQL

This works, but the git log is not very clean. When I run git log --all --decorate --oneline --graph, the L, M, N, O commits and the L', M', N', O' commits have different hashes, so each change appears twice in the log. Also, in the future I will need to commit to both branches. I guess this may not be the best practice. Is there a good way to deal with this issue?

Why not make the database connection a matter of *configuration*, rather than *code*? Otherwise as you've seen you'll have identical commits on the two branches but with different parents and therefore different hashes. — jonrsharpe, Apr 08 '20 at 06:57
@jonrsharpe I haven't explored that. Right now the database connection is hard-coded in application.properties of my Spring Boot app. What is the best practice for handling it as a matter of configuration? — Nublia, Apr 08 '20 at 07:08
If it's in the properties file, it already is configuration. You could have multiple profiles, or override it from an env var, or anything else in https://docs.spring.io/spring-boot/docs/current/reference/html/spring-boot-features.html#boot-features-external-config — jonrsharpe, Apr 08 '20 at 07:10

score 1 · Accepted Answer · answered Apr 08 '20 at 12:30

... It is said the branch is just a pointer to that commit O's hash not a record of history. Is it true?

That is correct. History, in Git, is the set of commits. A branch name simply selects one commit to be the end of the history for that branch. Running git checkout branch-name selects the given branch-name as the current branch, and hence the commit to which the name points as the current commit. This extracts the selected commit into Git's index (from which Git will make the next commit) and into your work-tree (where you can see and work on your files).

Running git commit takes whatever is in Git's index at that point, adds the appropriate metadata including the current commit's hash ID, writes out a new commit—which acquires its unique hash ID in the process—and then writes the new commit's hash ID into the current branch name, thereby making the name point to the new commit.
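To see the "name points to one commit" behavior directly, here is a minimal sketch in a throwaway repository. The file name and commit messages are made up for illustration; only standard Git commands are used.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master" regardless of local defaults
git config user.name Demo; git config user.email demo@example.com
git config commit.gpgsign false

echo one > file.txt
git add file.txt
git commit -q -m "first"
tip_before=$(git rev-parse refs/heads/master)   # the single commit the name selects

echo two >> file.txt
git add file.txt
git commit -q -m "second"
tip_after=$(git rev-parse refs/heads/master)    # the name now selects the new commit
parent=$(git rev-parse HEAD^)                   # the new commit remembers the old tip

echo "old tip:    $tip_before"
echo "new tip:    $tip_after"
echo "new parent: $parent"
```

The branch name itself stores nothing but the new hash; the link back to the earlier commit lives in the new commit's metadata.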

No commit—no internal Git object of any form, really—can ever be changed, not one single bit, once it's made. The reason for this is that the hash ID is a simple cryptographic checksum of the object's content. Change the content and you change the hash ID: you have a new, different object; the existing object remains unchanged.
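A small sketch of that immutability, again in a throwaway repository with made-up file names and messages: "changing" a commit with git commit --amend really creates a new object with a new hash, and the old object is still there.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name Demo; git config user.email demo@example.com
git config commit.gpgsign false

# Identical content always hashes to the same ID; different content does not.
h1=$(echo hello | git hash-object --stdin)
h2=$(echo hello | git hash-object --stdin)
h3=$(echo goodbye | git hash-object --stdin)

echo one > file.txt
git add file.txt
git commit -q -m "original message"
old=$(git rev-parse HEAD)

git commit -q --amend -m "reworded message"   # builds a NEW commit object
new=$(git rev-parse HEAD)

old_type=$(git cat-file -t "$old")            # the original object still exists
echo "old: $old ($old_type)"
echo "new: $new"
```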

Since commit L (whatever its real hash ID is) already exists, its content is now frozen for all time. That includes the fact that it has just one parent, commit K. What you can do is make a new merge commit, that is, a commit with two (or more) parents, whose parents are I and any other commit you like (O, for instance), and make your branch name PostgreSQL point to that new commit:

             K--L--M--N--O   <-- master (using MySQL)
            /             \
...--F--G--H               \
            \               \
             I---------------P   <-- PostgreSQL

You would obtain this state by running:

git checkout PostgreSQL; git merge master

possibly with some additional options, and possibly with some steps of resolving conflicts.
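As a rough sketch, the graph above can be reproduced in a toy repository. The layout is an assumption made purely for illustration: K adds a hypothetical mysql.properties, I adds a hypothetical postgres.properties, and L through O only touch app.txt, so the merge happens to be conflict-free. In a real project where K and I edit the same lines, expect a conflict to resolve at this step.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name Demo; git config user.email demo@example.com
git config commit.gpgsign false

commit() {   # commit <message>: stage everything and commit quietly
  git add -A
  git commit -q -m "$1"
}

echo base > app.txt;                 commit H
git branch PostgreSQL                # PostgreSQL starts out at H
echo mysql > mysql.properties;       commit K
for c in L M N O; do echo "$c" >> app.txt; commit "$c"; done

git checkout -q PostgreSQL
echo postgres > postgres.properties; commit I

git merge -q --no-edit master        # creates merge commit P
git log -1 --format='parents of P: %p'
git log --oneline --graph --all
```

Note that P's snapshot now contains mysql.properties as well, which is exactly the unwanted part discussed next.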

The snapshot (i.e., data contained in all the files) stored in commit P can be anything you like, although one is in general encouraged to let Git build the snapshot on its own. Git does this by combining work done (changes made) since the merge base, commit H in this case, by diff-ing the snapshot in H vs that in I to see what you changed, diff-ing the snapshot in H vs that in O to see what they changed, and combining the two sets of changes.

Because the snapshot in O contains changes that were introduced into the snapshot in K, the diff from H to O will include the changes you'd see by comparing H to K. So the result of this combining will include changes you did not want.
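You can inspect the inputs to that combining step yourself before merging. The sketch below uses the same toy layout as an assumption (K adds a hypothetical mysql.properties, I adds a hypothetical postgres.properties, L through O edit app.txt) and lists what each side changed relative to the merge base.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name Demo; git config user.email demo@example.com
git config commit.gpgsign false

commit() { git add -A; git commit -q -m "$1"; }

echo base > app.txt;                 commit H
git branch PostgreSQL
echo mysql > mysql.properties;       commit K
for c in L M N O; do echo "$c" >> app.txt; commit "$c"; done
git checkout -q PostgreSQL
echo postgres > postgres.properties; commit I

base=$(git merge-base PostgreSQL master)               # commit H in this graph
ours=$(git diff --name-only "$base" PostgreSQL)        # "what you changed"
theirs=$(git diff --name-only "$base" master)          # "what they changed", K included

echo "merge base: $(git log -1 --format=%s "$base")"
echo "changed on PostgreSQL:"; echo "$ours"
echo "changed on master:";     echo "$theirs"
```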

Again, the snapshot in P can be whatever you like: you can tell Git not to make commit P yet, and then make further changes to the copies of your files that Git has stored in Git's index.¹ That is, you could run git merge --no-commit: Git would do its best to combine changes as usual, and perhaps even succeed on its own. But you would then take the resulting files—as currently stored in both Git's index and your work-tree—and change them to, in effect, "undo" commit K. Since you can only edit the work-tree copy, you must then git add the updated file(s) as usual, after which:

git merge --continue

(or git commit) will commit the result. This produces what is sometimes called an evil merge.
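Here is a hedged sketch of that sequence in the same assumed toy layout (hypothetical mysql.properties from K, postgres.properties from I). Because K only adds a file in this toy, "undoing" it is a single git rm, which updates the work-tree and the index together; with an edited file you would edit it and then git add it instead. The sketch finishes with git commit -m rather than git merge --continue only to avoid opening an editor.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name Demo; git config user.email demo@example.com
git config commit.gpgsign false

commit() { git add -A; git commit -q -m "$1"; }

echo base > app.txt;                 commit H
git branch PostgreSQL
echo mysql > mysql.properties;       commit K
for c in L M N O; do echo "$c" >> app.txt; commit "$c"; done
git checkout -q PostgreSQL
echo postgres > postgres.properties; commit I

git merge -q --no-commit master      # combine as usual, but stop before making P
git rm -q -f mysql.properties        # take K's change back out (-f: it is staged, not yet in HEAD)
git commit -q -m "P: merge master without K's MySQL settings"

git log -1 --format='parents of P: %p'
ls
```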

Instead of an evil merge, you can commit a normal (non-evil? good?) merge and immediately run:

git revert <hash-of-K>   # or master~4, if I counted correctly

to make a new, additional commit that has the effect of undoing the changes from commit K, producing:

             K--L--M--N--O   <-- master (using MySQL)
            /             \
...--F--G--H               \
            \               \
             I---------------P--Q   <-- PostgreSQL

If you diff the snapshot in P against the snapshot in Q, the changes you see will be the opposite of the changes you'd see by diffing the snapshot in H against that in K. So commit Q would match P except for undoing (reverting) K.
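The merge-then-revert variant can be sketched the same way, under the same toy assumptions (K adds a hypothetical mysql.properties, I adds a hypothetical postgres.properties, nothing conflicts). The two git diff calls at the end compare H against K and P against Q so you can see that they mirror each other.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name Demo; git config user.email demo@example.com
git config commit.gpgsign false

commit() { git add -A; git commit -q -m "$1"; }

echo base > app.txt;                 commit H
git branch PostgreSQL
echo mysql > mysql.properties;       commit K
for c in L M N O; do echo "$c" >> app.txt; commit "$c"; done
git checkout -q PostgreSQL
echo postgres > postgres.properties; commit I

git merge -q --no-edit master                 # P: an ordinary merge, K's change included
git revert --no-edit master~4 > /dev/null     # Q: a new commit undoing K (master~4 is K here)

h_to_k=$(git diff --name-status master~5 master~4)   # H vs K
p_to_q=$(git diff --name-status HEAD^ HEAD)          # P vs Q
echo "H -> K: $h_to_k"
echo "P -> Q: $p_to_q"
```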

As discussed in comments, storing configuration information separately, rather than in committed files, is probably the way to go instead; but the above is one way of achieving the result you want. The differences between each of the various possible approaches show up, not today, but at some point in the future, when you'd like the two branches to be less different, or not even to have two separate branches at all.
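For the configuration route mentioned in the comments, one possible shape is a Spring Boot profile per database, all living on a single branch. This is only a sketch: the profile names, JDBC URLs, and app.jar are placeholders, and the temporary directory stands in for src/main/resources.

```shell
set -e
conf=$(mktemp -d)   # stand-in for src/main/resources

# One properties file per profile; Spring Boot loads application-<profile>.properties
# when that profile is active.
cat > "$conf/application-mysql.properties" <<'EOF'
spring.datasource.url=jdbc:mysql://localhost:3306/appdb
EOF

cat > "$conf/application-postgres.properties" <<'EOF'
spring.datasource.url=jdbc:postgresql://localhost:5432/appdb
EOF

ls "$conf"

# Choose a profile at startup (app.jar is a placeholder name):
#   java -jar app.jar --spring.profiles.active=postgres
# Or override a single property from the environment instead:
#   SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/appdb java -jar app.jar
```

With this layout both databases are served from the same history, so there is no second branch to keep in sync.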

¹Technically these are not separate copies, but the point here is that commit P will be made from whatever is in Git's index. If you use git merge --no-commit, then modify work-tree files, you must also run git add on those modified work-tree files, to update the index.
