3

If I have a commit history (that was cloned from a remote repository for example) that looks like the following:

A ---> B ---> C ---> D (master)

and I want to create a branch that looks like the following:

A ---> B ---> C ---> D (master)
   \-> N ---> B' ---> C' ---> D' (branch)

That is, insert a commit called 'N' on the branch right after commit A then copy to this branch the commits B, C, and D. Let's say I use the following steps to create the branch 'branch':

create 'branch' -> make commit N -> find the hash of the tree object of commit B using git cat-file -p <hash of commit B> -> run the command: git commit-tree <hash of tree of B> -p <hash of commit N> -> move 'branch' to point to B' -> repeat the last three steps for commits C and D, each time using the previous copy as the parent
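The steps above can be sketched as a small script. This is only an illustration in a throwaway repository: the file names, commit contents, and identity settings are made up, and `HEAD^{tree}`-style syntax is used instead of reading the tree hash from `git cat-file -p`. Note that this sketch copies only the tree and the message, so author and date are not carried over from the originals.

```shell
#!/bin/sh
# Sketch: replay B, C, D on top of N by reusing their trees (no diffing, no conflicts).
set -eu

repo=$(mktemp -d)                      # throwaway demo repository (hypothetical data)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Build A ---> B ---> C ---> D on master.
for c in A B C D; do
    echo "$c" > file.txt
    git add file.txt
    git commit -q -m "$c"
done

# Create 'branch' at A and add commit N.
git checkout -q -b branch master~3
echo "extra" > extra.txt
git add extra.txt
git commit -q -m "N"

# Copy B, C, D (oldest first): same tree, same message, parent = current tip of 'branch'.
for old in $(git rev-list --reverse master~3..master); do
    new=$(git log -1 --format=%B "$old" | git commit-tree -p HEAD "$old^{tree}")
    git reset -q --hard "$new"         # move 'branch' (and the working tree) to the copy
done

git log --oneline --graph master branch
```

After the loop, `git diff master branch` prints nothing, because D' reuses the tree of D.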

After adding commit N, is there a risk in using the git commit-tree command as above to copy commits B, C, and D to the branch? Will I lose any files, changes, etc. from B, C, or D by doing so (especially if there were conflicts between B and N, for example)? As you can see, this is different from using cherry-pick in the sense that cherry-pick checks for conflicts between B and B', C and C', D and D' before cherry-picking, while this approach does not check for any conflicts.

oguz ismail
Doe
  • git-commit-tree is not a command you would run typically, it's used under the hood by other commands. It even says this in the docs [This is usually not what an end user wants to run directly. See git-commit instead.](https://git-scm.com/docs/git-commit-tree). You should be using cherry pick here – Liam May 28 '21 at 06:46
  • Does this answer your question? [How to inject a commit between some two arbitrary commits in the past?](https://stackoverflow.com/questions/32315156/how-to-inject-a-commit-between-some-two-arbitrary-commits-in-the-past) – Romain Valeri May 28 '21 at 07:08
  • There is no risk of losing files because you are copying over entire trees. The new commits replicate the states of the original commits faithfully (except for committer name and timestamps). – j6t May 28 '21 at 07:26
  • When you use `git commit-tree`, Git makes a new commit object. This commit object has no *name* yet, which means it is *unreachable* (`git fsck --unreachable` will find it for instance). Objects that are unreachable may be garbage collected and thrown away by `git gc`, after a grace period. The grace period gives you time to get things done, such as add a few more commit objects and then add a branch name that refers to the whole pile of new commits. – torek May 28 '21 at 08:17
  • The default grace period is 14 days. So this means that the only real danger (assuming you haven't cranked this setting way down) is that you must finish your work within two weeks, or `git gc` might go rip it all out by the roots. Note that commit `N` has no effect on the subsequent commit snapshots, though. – torek May 28 '21 at 08:19
  • OK. For the specific example above, where the commit (N) would become part of the branch 'branch', it seems there is no risk of losing it to the garbage collector. Is this correct? – Doe May 29 '21 at 05:05
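The reachability point from the comments can be checked with a short experiment. This is a sketch in a throwaway repository with made-up contents; the branch name `keep` is hypothetical. It creates a commit with `git commit-tree`, confirms that `git fsck --unreachable` reports it, then gives it a branch name and checks again.

```shell
#!/bin/sh
# Sketch: a commit made by commit-tree is unreachable until some ref points at it.
set -eu

repo=$(mktemp -d)                      # throwaway demo repository (hypothetical data)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo A > file.txt
git add file.txt
git commit -q -m A

# commit-tree only writes the commit object; no branch or tag refers to it yet.
new=$(git commit-tree -p HEAD -m "copy" "HEAD^{tree}")

if git fsck --unreachable 2>/dev/null | grep -q "$new"; then
    before=unreachable
else
    before=reachable
fi

# Naming the commit makes it (and its ancestors) reachable, so gc will keep it.
git branch keep "$new"

if git fsck --unreachable 2>/dev/null | grep -q "$new"; then
    after=unreachable
else
    after=reachable
fi

echo "before: $before, after: $after"
```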

2 Answers

5

It depends on your purpose. Why do you want N in the first place? After you create B', all of the changes introduced by N (the ones between A and N) are just gone, because the files are now exactly the same as those of B. In the end, the files of D' are identical to those of D.

If you want there to be a snapshot of N in the history so that you can check it out later, it's okay to do so.

If you want the changes introduced by N to exist on the new branch, you need git cherry-pick or git rebase. Otherwise, the only harmless case I can think of is when the changes between A and N are a subset of the changes between A and B. In other words, you want to split B's changes into N and B'. As @torek pointed out in the comments, the changes are not really gone; they just have no effect on the subsequent commit snapshots, as if they had been reverted in B'.

In practice, I've used git commit-tree in 2 cases. In the first, we had a messed-up branch with a long history. It was quite hard to clean it up with git revert or git rebase, so we decided to reset the branch to a previous good commit. But we were not allowed to delete and recreate, or force-push, any existing branch. So we used git commit-tree -p HEAD -m xxx ${good_commit}^{tree} to reset the code to the state of the good commit while keeping the whole messed-up history.

In the other case, we wanted branch foo to have the same code as branch bar. foo and bar had diverged long ago and had been developed separately since. So we used git commit-tree -p foo -m xxx bar^{tree} to reset the code while keeping the history.
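A minimal sketch of the first case, in a throwaway repository with made-up commits (`good`, `bad1`, `bad2` are hypothetical). Using `git merge --ff-only` to advance the branch is one possible way to move it onto the new commit; the answer does not say which command was used for that step.

```shell
#!/bin/sh
# Sketch: restore the state of an earlier commit without rewriting history.
set -eu

repo=$(mktemp -d)                      # throwaway demo repository (hypothetical data)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

for c in good bad1 bad2; do
    echo "$c" > file.txt
    git add file.txt
    git commit -q -m "$c"
done
good_commit=$(git rev-parse HEAD~2)

# New commit: parent is the current tip, snapshot is the good commit's tree.
new=$(git commit-tree -p HEAD -m "Restore state of good commit" "${good_commit}^{tree}")

# The new commit descends from HEAD, so the branch can simply fast-forward to it.
git merge -q --ff-only "$new"

git log --oneline
```

The old commits stay in the history, and the new tip has exactly the files of the good commit, so no force-push is needed.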

ElpieKay
  • Good examples. In my case, I have an SVN repository that I cloned using `git svn`, but this command does not clone the submodule from SVN. So I wanted to create a branch that takes a commit from the SVN-cloned branch, in this case (A), then adds its submodule to it, which results in a new commit (N), and then copies the rest of the commits from the SVN-cloned branch on top of commit (N). One way I thought of to copy the content of B - D on top of (N) was to find the tree object of (B) and create a new commit object that points to (N) as its parent, then repeat for C and D to eventually get the history shown above – Doe May 29 '21 at 04:51
  • You might suggest using cherry-pick instead, but it uses diffs, which could cause conflicts that I would have to resolve manually for every commit in the history to which I add submodules. With `git commit-tree` I could copy the content of one commit as is to a new parent without worrying about conflicts, but I wanted to make sure I will not lose any files or data or have problems down the road by using `git commit-tree` instead of `git cherry-pick` – Doe May 29 '21 at 05:03
  • Hi, wondering could you help me explain what the `^{tree}` means here? saw many usage in git commit-tree like `HEAD^{tree}`. I understand the `^` means go back some commits, but what does the {tree} mean? Googled but it always lead me to the git tree explanation :/ – lanyusea Oct 28 '21 at 07:30
  • @lanyusea it's described at https://git-scm.com/docs/gitrevisions#Documentation/gitrevisions.txt-emltrevgtlttypegtemegemv0998commitem. Here `^` does not mean to go back some commits. – ElpieKay Oct 28 '21 at 07:56
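To illustrate the `^{tree}` suffix discussed in the comments above: it asks Git to peel the commit to the tree object it points at. The following sketch (throwaway repository, made-up contents) compares the result with the `tree` line printed by `git cat-file -p`.

```shell
#!/bin/sh
# Sketch: HEAD^{tree} names the tree object recorded in the commit HEAD.
set -eu

repo=$(mktemp -d)                      # throwaway demo repository (hypothetical data)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo A > file.txt
git add file.txt
git commit -q -m A

# Peel the commit to its tree with the ^{tree} suffix.
peeled=$(git rev-parse "HEAD^{tree}")

# Read the same hash from the "tree <hash>" header line of the commit object.
from_header=$(git cat-file -p HEAD | sed -n 's/^tree //p' | head -n 1)

# Ask Git what kind of object the peeled name refers to.
type=$(git cat-file -t "$peeled")

echo "peeled:      $peeled"
echo "from header: $from_header"
echo "object type: $type"
```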
0

This doesn’t seem to be what you want, based on your questions and concerns.

You ask if you will lose anything from commits B, C, and D, and you will not. The only commit that you should be concerned about is N.

With git-commit-tree(1) you use the snapshot of each of the commits. So

git diff D D'

will return nothing because the trees are identical.

But what about the files in N? They become irrelevant.

Let’s just forget about text because it is too distracting: you can easily diff text files, so it’s easy to oscillate between thinking of them as snapshots and thinking of them as diffs. Imagine if these were pictures.

A: donkey --- B: horse --- C: manatee --- D: elephant

Each commit is just one picture. One picture of an animal.

Meanwhile on the new branch:

A: donkey --- N: eagle --- B': horse --- C': manatee --- D': elephant

None of the preceding animals matter if you’re on D or D'—there’s only an elephant. The fact that you introduced N with its eagle has no impact on B'–D'.

(Back to the land of text) What this means more concretely is that if you, for example, introduce a file README.md in N which is not in any of the other commits, then it won’t be in B'–D'. In fact, none of the contents of N will matter. And that’s most of the time not what you want, which is why one typically uses git-rebase(1) or git-cherry-pick(1) instead.
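The README.md scenario can be reproduced with a small sketch. The repository is a throwaway one and the file contents are made up; only A, B, and N are created to keep it short.

```shell
#!/bin/sh
# Sketch: a file added in N is absent from B' when B' reuses B's tree.
set -eu

repo=$(mktemp -d)                      # throwaway demo repository (hypothetical data)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# A ---> B on master.
for c in A B; do
    echo "$c" > file.txt
    git add file.txt
    git commit -q -m "$c"
done

# Branch off A and add N, which introduces README.md.
git checkout -q -b branch master~1
echo "docs" > README.md
git add README.md
git commit -q -m "N"

# B' takes B's snapshot verbatim, with N as its parent.
b_prime=$(git commit-tree -p HEAD -m "B" "master^{tree}")

in_n=$(git ls-tree -r --name-only HEAD)
in_b_prime=$(git ls-tree -r --name-only "$b_prime")

echo "files in N:"
echo "$in_n"
echo "files in B':"
echo "$in_b_prime"
```

README.md is listed for N but not for B', which matches the eagle disappearing once the horse picture is committed.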

Guildenstern