Create an intermediary commit in history

Question

I've been tasked with replicating our centralized VCS solution onto a git repository. My plan is to use a Jenkins pipeline to absorb commits from the CVCS, pull latest from git, then commit those changes onto git and push.

The goal is to capture every commit as it comes in, for which I'm using a webhook from the CVCS that passes relevant information to the pipeline to identify the particular commit in history. For the purposes of this question, assume all development is done in one unending line with no branches.

My concern is the asynchronous nature of a webhook call. If I have commit 1000 synced to git, then commits 1001 and 1002 both come in nearly simultaneously, the Jenkins pipeline might run 1002 before 1001. That in itself isn't a huge problem (I intend to include a text file on the git repo to keep track of the last synced CVCS commit), but it would be nice if I could handle 1001 in a more intelligent way than to discard it.

Is there any way to create an intermediary commit in git history? Something like:

git checkout HEAD~1
git checkout -b some_temp_branch
git commit -m "This is commit 1001"
git checkout master
git merge some_temp_branch -Xtheirs         # or something?

where the resulting history goes from:

1002
 |
1000

to

1002
 |
1001
 |
1000

with no change in the working directory?

Also acceptable, though less ideal than the above, would be to simply create a new commit at the right point in history that uses the CVCS's commit message and no content. That might be doable with a rebase? — Adam Smith, May 28 '19 at 18:10
You can easily reorder commits using interactive rebase. If your commits show up in the wrong order, like `1000, 1002, 1001`, just do an interactive rebase, change the order of the commits in the editor, and save. — bruceg, May 28 '19 at 18:45
Maybe not a duplicate *per se*, but [this question](https://stackoverflow.com/questions/32315156/how-to-inject-a-commit-between-some-two-arbitrary-commits-in-the-past) is probably related? — Romain Valeri, May 28 '19 at 18:45
@RomainValeri indeed, since this is an automated process it is a requirement that it is scripted. — Adam Smith, May 28 '19 at 18:56
https://stackoverflow.com/search?q=%5Bgit-rebase%5D+non-interactive — phd, May 28 '19 at 19:31

Schwern · Answer 1 · 2019-05-28T23:34:39.377

Whether you are converting to Git or want to create a mirror of your repository in Git, these tools already exist. How well developed they are depends on the VCS. For example, git-svn is very well developed; Git can act as a two-way mirror of Subversion. Others are available online; look for <vcs name>2git. For example, if you're using CVS, you'd look for cvs2git or git cvsimport.

If a Git migration tool does not already exist for your VCS, you can build one by turning your commits into a format suitable for git fast-import. Rather than involving Jenkins, you would use the normal VCS client to retrieve the latest commits; there would then be no ordering issue.
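As a rough illustration of the fast-import idea, here is a minimal sketch that replays two commits, in order, into a fresh repository. The `emit_commit` helper, the `state.txt` file name, the committer identity, and the timestamps are all made up for the example; a real exporter would take them from your VCS client's output.

```shell
#!/bin/sh
# Sketch only: feed two hypothetical CVCS commits to git fast-import.
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# emit_commit <cvcs revision> <unix timestamp> <file content>
# Writes one commit in fast-import stream format to stdout.
# Each "data <n>" line announces the byte length of the payload that follows
# (ASCII-only here, so ${#var} equals the byte count).
emit_commit() {
    msg="CVCS commit $1"
    content="$3"
    printf 'commit refs/heads/master\n'
    printf 'committer CVCS Mirror <mirror@example.invalid> %s +0000\n' "$2"
    printf 'data %s\n%s\n' "${#msg}" "$msg"
    printf 'M 644 inline state.txt\n'
    printf 'data %s\n%s\n' "${#content}" "$content"
}

# Within a single stream, each commit on an existing branch is parented
# on that branch's current tip, so emitting in CVCS order gives linear history.
{
    emit_commit 1001 1559066400 "state at 1001"
    emit_commit 1002 1559066460 "state at 1002"
} | git fast-import --quiet

git log --reverse --format=%s master
```

Because the exporter decides the order of the stream, there is no race between revisions. An incremental run against an existing branch would also need a `from refs/heads/master^0` line in the first commit of each new stream.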

If the intention is to convert to Git I would recommend inverting your process. Which approach is best depends on your situation, but generally it's better to use Git as the leader repository and your old VCS as the follower. Git is very flexible and powerful. It can adapt to how you were using your old VCS, but your old VCS cannot adapt to Git. Making Git the follower restricts you to using version control the same as you are now; what's the point of converting?

If Git is the leader, Git can act in a centralized mode as you're used to. Be sure to only commit to master and always put your local commits on the tip of master using git pull --rebase before pushing. Meanwhile you can explore Git's new features; for example, feature branches. If it's necessary to keep the old VCS around, create a read-only mirror by replaying commits made to the master branch into the old VCS.
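To show what that centralized mode looks like in practice, here is a small self-contained sketch. The repository names (`central.git`, `alice`, `bob`), file names, and the demo identity are invented for the example; it only demonstrates that git pull --rebase keeps master linear when two people push.

```shell
#!/bin/sh
# Sketch only: two clones sharing one bare "central" repository.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/central.git"
git -C "$work/central.git" symbolic-ref HEAD refs/heads/master

# alice creates the first commit and publishes it.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/central.git"
echo one > a.txt
git add a.txt
git commit -q -m "first"
git push -q -u origin master

# bob clones, then alice publishes again before bob pushes.
git clone -q "$work/central.git" "$work/bob"
echo two > a2.txt
git add a2.txt
git commit -q -m "alice: second"
git push -q origin master

# bob commits locally, replays his work onto the new tip, then pushes.
cd "$work/bob"
echo bob > b.txt
git add b.txt
git commit -q -m "bob: local work"
git pull -q --rebase origin master
git push -q origin master

git log --format=%s master   # newest first: bob, alice, first
```

No merge commits appear, so the history reads like the single line of development you have today.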

score 0 · Answer 2 · answered May 28 '19 at 21:36

It's worth noting that no matter how you do this, what you get is not an inserted commit.

If you rebase—however you do it—you get instead a new branch, ending with a new commit. The new branch can re-use the name of the old branch (leading to the obvious question, What exactly do we mean by "branch"?), but your new commit chain will have new and different hash IDs from your original commit chain.

That is, if you had:

... <-c1000a <-c1001a   <-- master

because your system thought c1001 should come right after c1000, and now it's realized that, no, there should be a commit in between, you can make a new c1001b:

... <-c1000a <-c1001a   <-- master
            \
             c1001b

but now you have to copy what was c1001a to c1002b:

... <-c1000a <-c1001a   <-- master
            \
             c1001b <- c1002b

You can now have the name master point to c1002b, "forgetting" c1001a entirely:

... <-c1000a <-c1001a
            \
             c1001b <- c1002b   <-- master

The forgotten commit continues to exist (and be valid) for as long as Git remembers its hash ID somewhere (typically, in a reflog entry, for 30+ days, except on --bare repository servers where there's no grace period). If some other Git has grabbed c1001a, that other Git retains it, and may merge it with c1002b—or whatever the latest tip of master is—later since, if they were both named master, you probably want c1001a merged in. (You really don't, but Git doesn't know that.)
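The rewrite described above can be sketched as follows. Note this is not git rebase itself: it assumes the snapshot already committed for the later revision is correct, and re-creates that commit on a new parent with git commit-tree, which avoids conflict resolution. The file name `state.txt`, the commit messages, and the demo identity are made up for the example.

```shell
#!/bin/sh
# Sketch only: slot a late commit in before the current tip by rewriting the tip.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# History as synced so far: 1000, then 1002 (which arrived first).
echo 1000 > state.txt
git add state.txt
git commit -q -m "CVCS commit 1000"
echo 1002 > state.txt
git commit -q -am "CVCS commit 1002"
old_tip=$(git rev-parse master)        # "c1001a" in the diagrams

# Create the late commit on top of the old tip's parent ("c1001b").
git checkout -q -b temporary master~1
echo 1001 > state.txt
git commit -q -am "CVCS commit 1001"

# Copy the old tip: same tree, new parent, hence a new hash ("c1002b").
new_tip=$(git commit-tree -p temporary -m "CVCS commit 1002" "$old_tip^{tree}")

# Point master at the copy and drop the temporary name.
git checkout -q master
git reset -q --hard "$new_tip"
git branch -q -D temporary

git log --format=%s master   # newest first: 1002, 1001, 1000
```

The working tree ends up identical to what it was before, but master now names a different commit, so publishing it requires a forced push.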

If you merge, you just get a merge commit at the end. You add the commit to the new temporary branch:

             c1001a   <-- master
            /
... <-c1000a
            \
             c1001b   <-- temporary

and then git checkout master; git merge -s ours temporary to retain the source tree from c1001a:

             c1001a
            /      \
... <-c1000a        c1002a   <-- master
            \      /
             c1001b   <-- temporary

Erasing the temporary name gives you the final result:

             c1001a
            /      \
... <-c1000a        c1002a   <-- master
            \      /
             c1001b

which won't give any other Git repositories any heartburn, as you didn't deliberately throw out a bad commit in favor of a new and improved version (that the other Git might then bring back). Instead, you merely added commits, which all Gits understand other Gits to do at any time.
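The merge variant above can be sketched like this. As before, `state.txt`, the commit messages, the merge message, and the demo identity are made up for the example.

```shell
#!/bin/sh
# Sketch only: record a late commit via "merge -s ours" without rewriting history.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# History as synced so far: 1000, then 1002 (which arrived first).
echo 1000 > state.txt
git add state.txt
git commit -q -m "CVCS commit 1000"
echo 1002 > state.txt
git commit -q -am "CVCS commit 1002"
old_tip=$(git rev-parse master)

# Put the late commit on a temporary branch off the old tip's parent.
git checkout -q -b temporary master~1
echo 1001 > state.txt
git commit -q -am "CVCS commit 1001"
late=$(git rev-parse temporary)

# Merge it in, keeping master's tree exactly as it was.
git checkout -q master
git merge -q -s ours -m "Merge late CVCS commit 1001" temporary
git branch -q -d temporary

git log --graph --format=%s master
```

Existing commits keep their hashes, so an ordinary push is enough; the cost is a two-parent merge commit at the tip instead of a single line.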

The drawback to the merge method is that this is perhaps not the correct history.

In the short term, at least, I'm not concerned about polluting things for downstream clients. The design is that since we're pulling from a CVCS, we're treating git as a CVCS for the purposes of this task. I'm allowed to `git push --force` with arbitrary changes and downstream clients are expected to handle that behavior. — Adam Smith, May 28 '19 at 22:25
