
In the process of moving to use git we have taken a production version of a solution and committed it as master.

Then we took a development version and made an orphan branch called develop.


(Background: we are getting a bit tangled up here because there was not a clean evolution from the development version to the production version. Plus, assembling the solutions involved is complex enough that we want to avoid scrapping the repo and trying again. In the end, we just want to get these versions into git and commence the cleanup within git.)


So - now we think it would have been better to branch the development version from master rather than keep it as an orphan branch.

How can we basically take the master commit and make it the parent to the develop commit but without any merging taking place? Without changing the file contents of that first develop commit?

That is - to just graft develop onto it somehow, as-is, if that makes any sense?

El Entrenador
  • `git rebase --onto master develop`? – siride Oct 05 '15 at 17:25
  • After some earlier dabblings, I came away with the impression that rebasing always tries to merge, am I wrong? The "--onto" stops that? – El Entrenador Oct 05 '15 at 18:02
  • Rebasing doesn't merge, per se. However, if a commit in the rebased branch introduces a change that conflicts with a change already made in the rebased-onto branch, then you will have a merge conflict. It's unavoidable from first principles. All I can say for your case is to try it and hope that you don't have conflicts. – siride Oct 05 '15 at 19:56

1 Answer


It sounds like you want to modify your commit graph without modifying any of the trees attached to those commits.

This statement will make more sense if you understand how git works internally. The key items here are:

  1. All commits are permanent and unchanging, because the "true name" of a commit (or indeed any of git's four internal objects) is the SHA-1 cryptographic checksum of its contents. This means that if you attempt to change anything (or a failing disk drive changes something), git will complain of a bad checksum, since what you see in each object doesn't match its "true name" any more.

  2. Each commit carries with it the SHA-1 ID of a "tree" object that acts as the complete snapshot of the source that goes with that commit. (The tree gives the file-names and SHA-1 "true names" of each file or directory, with sub-directories being represented by yet another tree. The details don't matter too much here.)

  3. Each commit also lists the "true name" SHA-1 IDs of its parent(s). Thus, given the tip commit of develop, git can read that commit and find its immediate parent(s). Reading that parent (or those parents), git finds the next ID(s), which it reads for their parents, and so on. The process stops upon reading a "root" commit, which is one with no parents. Doing an --orphan checkout followed by making a commit results in a new root commit, which is no doubt how you made your branch.

These parent and tree IDs, as stored in any given commit, are said to "point to" the other objects in the git repository. (There are only four types of objects in a repo. We already mentioned "commit" and "tree", the other two are "blob"—which is how git holds a file—and "tag". Trees point to blobs and sub-trees, and "tag" objects are used for annotated tags.)
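To see these pointers directly, here is a small sketch using `git cat-file -p`. It builds a throwaway repository that loosely mimics the situation in the question; the file names, commit messages and the `demo` identity are made up for illustration.

```shell
#!/bin/sh
set -e
# Throwaway repository; every name below is illustrative, not from the question.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo prod > app.txt
git add app.txt
git commit -q -m "production version"

# An orphan branch starts a new history: its first commit has no parent.
git checkout -q --orphan develop
echo dev > app.txt
git add app.txt
git commit -q -m "development version"
echo more > extra.txt
git add extra.txt
git commit -q -m "more development"

# The tip commit names one tree and one parent.
git cat-file -p develop
# The root commit names a tree but has no "parent" line at all.
git cat-file -p develop~1
# The tree lists file names plus the IDs of blobs (and of any sub-trees).
git cat-file -p 'develop^{tree}'
```

The missing `parent` line in the second listing is exactly what would need to change, and since the commit's ID is a checksum of that text, adding the line necessarily produces a different commit.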


Thus, what you want to do is to change the root commit of your develop branch so that it now has a parent commit, in particular some commit that is reachable from the current tip of the master branch. (Perhaps you want the tip itself, perhaps you want something like master~100: this detail only matters when you go to make the change.)

The bad news is that because of item #1, you can't quite do this.

The good news is that you can sort-of do this using any of three alternative methods (and then make the graft permanent, if needed and desired).

First, git has a thing called "grafts". They don't work too terribly well, so git has a new better thing called "replacements". Depending on your git vintage you should have at least one, and almost certainly both.

These both use the same general idea. As git is doing its graph traversal, going from commit to parent, you want, at some point, to be able to get git to change its traversal. Using git grafts, you simply specify that when it is on the object with ID <X> it should traverse to parent <Y>. This lets you find the root commit on your current develop and graft it directly onto some commit in master.

These grafts do not get copied by git clone, and introduce other issues, so now git has git replace. This creates an actual object in the repository, and allows you to inspect the commit graph both with and without replacements. To use it for this case, you would make a replacement commit for your root commit that's exactly like the existing root commit except that it has your desired parent.
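As a rough sketch of the replacement approach, assuming a Git recent enough to have `git replace --graft`, the following builds a throwaway repository and gives the root of develop the tip of master as its parent. The repository contents and the `demo` identity are invented for the example.

```shell
#!/bin/sh
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo prod > app.txt
git add app.txt
git commit -q -m "production version"

git checkout -q --orphan develop
echo dev > app.txt
git add app.txt
git commit -q -m "development version"

# Find the parentless (root) commit of develop and remember its tree.
root=$(git rev-list --max-parents=0 develop)
tree_before=$(git rev-parse "$root^{tree}")

# Create a replacement for $root that differs only in having master as parent.
git replace --graft "$root" master

# With replacements honoured (the default), develop now descends from master.
git log --oneline --graph develop
# The original, parentless view is still there if you ask for it.
git --no-replace-objects log --oneline develop
```

The replacement lives under `refs/replace/`, so other clones see the grafted history only if that ref is fetched or pushed explicitly.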


If you want to rewrite history entirely, making it easy for all users after the one painful rewrite step, you can set up a graft and then use git filter-branch to copy-and-replace the graft. Or, perhaps slightly easier, specify a parent-filter (and no other filters). See the git filter-branch documentation for details; it has an example of this very case. (The documentation suggests that creating the graft is simpler; I think using --parent-filter is simpler. Either way, there's a handy example.)
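A minimal sketch of the `--parent-filter` variant, modelled on the `sed` example in the filter-branch documentation and run against a throwaway repository (file names, messages and the `demo` identity are assumptions for the demo). Try it on a scratch clone before touching a real repository.

```shell
#!/bin/sh
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo prod > app.txt
git add app.txt
git commit -q -m "production version"

git checkout -q --orphan develop
echo dev > app.txt
git add app.txt
git commit -q -m "A: development version"
echo more > extra.txt
git add extra.txt
git commit -q -m "B: more development"

target=$(git rev-parse master)             # the commit that should become the parent
old_tip=$(git rev-parse develop)
trees_before=$(git log --format=%T develop)

# The filter receives the parent list on stdin. Only a root commit has an
# empty list, so only that one gets "-p <target>" substituted in.
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --parent-filter "sed 's/^\$/-p $target/'" develop

# develop is now master <- A' <- B', with the same trees as A and B.
git log --oneline --graph develop
```

filter-branch keeps the pre-rewrite tip under `refs/original/refs/heads/develop`, which gives you a way back if the result is not what you wanted.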

Note that filter-branch copies every "filtered" commit, with some change(s) applied, by (virtually) checking out that commit, then applying the filter(s), then creating a new commit from the result. In this case the first change is to add a parent ID to the root commit. The second change is one filter-branch does automatically: make the commit that follows the no-longer-orphaned root-copy point to the root-copy. The third change is the same as the second, and so on:

      A--B--C     <-- develop (before filtering)

...--o--o--...    <-- master
      \
       A'-B'-C'   <-- develop (after filtering)

Here, commit A' is exactly like commit A except that it has a parent ID; B' is exactly like B except that its parent is A' instead of A; and so on.

(Note, by the way, that the parent arrows all point leftward here, but filter-branch works left-to-right. It does this by enumerating all the commits to filter first, getting their IDs in right-to-left fashion, but then doing the filtering left-to-right.)

This existing SO Q-and-A has a lot more on using grafts or replace.

I did mention a third method above. This one works only when develop has no existing merge commits. You can then simply rebase every commit in develop onto your target commit, as siride suggested in a comment. You'll need to run that rebase command with --root to copy every commit from the root of your independent develop branch.

This works because git rebase simply copies commits, much the same way git filter-branch does. It's more work in one sense, because the way rebase copies commits is by applying diffs (using repeated git cherry-picks, more or less) rather than simply keeping each copied commit's existing "tree" object, but it's a lot easier to work with than git filter-branch. The drawback here is that rebase does not handle merge commits at all (well, not unless you use --preserve-merges, which uses the interactive rebase code).
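For comparison, here is a sketch of the rebase route on a throwaway repository. The two histories deliberately touch different files (all names are made up), so the rebase applies cleanly; it also shows that the resulting trees are not the original ones.

```shell
#!/bin/sh
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo prod > prod.txt
git add prod.txt
git commit -q -m "production version"

# Start the orphan branch with an empty index and working tree.
git checkout -q --orphan develop
git rm -rqf .
echo dev > dev.txt
git add dev.txt
git commit -q -m "A: development version"
echo more > dev2.txt
git add dev2.txt
git commit -q -m "B: more development"

tree_before=$(git rev-parse 'develop^{tree}')

# Copy every commit of develop, root included, on top of master.
git rebase --onto master --root develop

git log --oneline --graph develop
# Each copy is "master's files plus the diff", so prod.txt now shows up too.
git ls-tree --name-only develop
```

Because each copied commit is rebuilt from a diff, files that exist in both histories with different content will conflict, and the snapshots on develop end up different from the originals. If the first develop commit must keep its exact file contents, the replace or filter-branch approaches fit better.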

torek
  • Thanks for your help. Tried the git rebase but it came back with merge conflicts. Probably your --parent-filter approach would work. But on following your link to the other SO question, I found this "git reparent" script - looks like it should do what we want: https://github.com/MarkLodato/git-reparent/tree/e886dc7e970f8c05ee7b82a6f8d855ffb7cc4d88 – El Entrenador Oct 06 '15 at 17:35
  • According to the documentation, that script only copies the `HEAD` commit (which is easy but won't help if you want a whole chain of commits copied). – torek Oct 06 '15 at 18:05