My question arose when I read the following in the git rebase documentation:

If the upstream branch already contains a change you have made (e.g., because you mailed a patch which was applied upstream), then that commit will be skipped. For example, running git rebase master on the following history (in which A' and A introduce the same set of changes, but have different committer information):

      A---B---C topic
     /
D---E---A'---F master

will result in:

               B'---C' topic
              /
D---E---A'---F master

One way to check is to compare patch IDs using git patch-id, but that is not what I am asking about.

Suppose I have two branches, topic and master, and I change only one file on them.

Inserted 2  ->  T2     M2 <--  Inserted 2 in new line
                |      |       
Inserted 1  ->  T1     M3 <-- Inserted 3 in new line
                  \   /
                   \ /
                    * <--  Contents similar here 

Now at T2 and M2, the patches are not considered the same, even though both add 2 on the same new line in their version of the file (I checked this with git patch-id). This surprised me. I thought two patches would be the same if the same content is added on the same line in two different versions of a file.

This made me think that a patch also depends on the previous commit, the one I am applying the patch to. So, when we say (patch1 on some branch) = (patch2 on some other branch), do their ancestors also need to be the same? If yes, we can apply this recursively and the two branches will come out identical, which is illogical.

So, my question is: when do we say that two patches are equal (not considering the patch ID)?

Use this script to reproduce the above locally:

#!/bin/bash

git init .
echo "10" >> 1.txt && git add . && git commit -m "1"

# Add 2 commits to master
echo "3" >> 1.txt && git commit -am "m3"
echo "2" >> 1.txt && git commit -am "m2"


# Check out the topic branch from the initial commit
git checkout -b topic HEAD~2
echo "1" >> 1.txt && git commit -am "t1"
echo "2" >> 1.txt && git commit -am "t2"

# Show graph
git log --oneline --all --decorate --graph
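
To see the mismatch directly, the history can be rebuilt in a throwaway directory and the two patch IDs printed side by side. This is only a sketch: the temporary directory, the placeholder identity, and the variable names m2, t2, id_m2 and id_t2 are my own additions and not part of the script above.

```shell
#!/bin/bash
# Sketch: rebuild the history from the script above in a throwaway
# directory and compare the patch IDs of the commits "m2" and "t2".
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.com"

echo "10" >> 1.txt && git add . && git commit -q -m "1"

# Two commits on the initial branch (master in the question)
echo "3" >> 1.txt && git commit -q -am "m3"
echo "2" >> 1.txt && git commit -q -am "m2"
m2=$(git rev-parse HEAD)

# Two commits on topic, starting from the initial commit
git checkout -q -b topic HEAD~2
echo "1" >> 1.txt && git commit -q -am "t1"
echo "2" >> 1.txt && git commit -q -am "t2"
t2=$(git rev-parse HEAD)

# git patch-id prints "<patch-id> <commit-id>"; keep only the first field
id_m2=$(git show "$m2" | git patch-id | cut -d' ' -f1)
id_t2=$(git show "$t2" | git patch-id | cut -d' ' -f1)
echo "m2: $id_m2"
echo "t2: $id_t2"

# Both diffs add the line "2", but the unchanged line just above it
# is "3" in one diff and "1" in the other.
git --no-pager show --format= "$m2"
git --no-pager show --format= "$t2"
```

The two printed IDs differ, which is the surprising result described above.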
Number945

1 Answer

So, when we say (patch1 on some branch) = (patch2 on some other branch) , then their ancestors also need to be same?

Not for git rebase, no. Rebase uses the same computation as git patch-id, which is nominally a result of hashing the stripped-down (line numbers and whitespace removed) diff text.[1]

The git rev-list command also does this. See its --left-right, --right-only, --cherry-mark, and --cherry-pick options, which must be used with the symmetric difference three-dot notation commit selectors.
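
As a rough illustration of those options, the sketch below builds a small history similar in spirit to the one in the rebase documentation: topic has commits A and B, and the other branch (called upstream here) has F plus a cherry-picked copy of A. The file names, the branch name upstream, the placeholder identity, and the commit_file helper are all made up for this example.

```shell
#!/bin/bash
# Sketch: mark or omit patch-equivalent commits with git rev-list.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.com"

# Hypothetical helper: create one new file and commit it.
commit_file() {
    echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"
}

commit_file E                              # shared base commit

git checkout -q -b topic                   # topic: E---A---B
commit_file A
commit_file B

git checkout -q -b upstream topic~2        # upstream: E---F---A'
commit_file F
git cherry-pick topic~1 > /dev/null        # A' carries the same change as A

b=$(git rev-parse topic)

# --cherry-mark prefixes equivalent commits with "=" and the rest with "+"
marked=$(git rev-list --cherry-mark upstream...topic)
echo "$marked"

# --cherry-pick omits equivalent commits; --right-only keeps the topic side
remaining=$(git rev-list --cherry-pick --right-only upstream...topic)
echo "would be copied: $remaining"
```

Here A and A' come out marked with "=", and only B is left on the topic side once the equivalent pair is omitted.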

In fact, git rebase uses git rev-list to do the work. In the old days, when git rebase was mostly shell scripts, it was easy to see how this was done. Now it's all built as C code, so instead of running git rev-list, it has the same bits of git rev-list compiled in.

... thought patch will be same if same contents on same line ...

No, the line numbers are removed. This is on purpose: a patch might, for instance, be as simple as replacing a call that passes false with one that passes true, which to Git is:

-    foo(false)
+    foo(true)

(with, in the case of git diff, some surrounding context; it's not clear whether the patch ID includes the context, but I would assume that it does). Suppose this fix is accepted upstream while you're working on a feature that may or may not be related to the fix. What if, upstream, that call to foo, which was on line 42, is now on line 47 because five unrelated lines were added well above this point?

Rebase should, and does, omit this patch now that it exists in the upstream to which you are rebasing, as determined by doing a --left-right pass over the symmetric difference of the upstream argument to rebase, and HEAD. All the left-side commits have their patch IDs calculated. All the right-side commits have their patch-IDs calculated. If the patch IDs match, the commit is considered a duplicate, and elided from the set of commits to copy.
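
The scenario can be tried out with a sketch like the following. It assumes a made-up file code.txt, made-up branch names upstream and feature, and a placeholder identity; the five unrelated lines are added far enough above the foo call that the lines immediately around it are unchanged.

```shell
#!/bin/bash
# Sketch: the same fix at different line numbers gets the same patch ID,
# and git rebase leaves the duplicate out.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.com"

# Base: ten filler lines, the call, ten more filler lines
{ seq 1 10; echo "    foo(false)"; seq 11 20; } > code.txt
git add code.txt && git commit -q -m "base"
git branch feature

# upstream: five unrelated lines at the top of the file, then the fix
git checkout -q -b upstream
{ seq 101 105; cat code.txt; } > tmp && mv tmp code.txt
git commit -q -am "unrelated lines"
sed 's/foo(false)/foo(true)/' code.txt > tmp && mv tmp code.txt
git commit -q -am "fix (as applied upstream)"

# feature: the same fix against the old line numbers, plus other work
git checkout -q feature
sed 's/foo(false)/foo(true)/' code.txt > tmp && mv tmp code.txt
git commit -q -am "fix (my copy)"
echo "work" > feature.txt && git add feature.txt && git commit -q -m "feature work"

# git patch-id prints "<patch-id> <commit-id>"; keep only the first field
id_up=$(git show upstream | git patch-id | cut -d' ' -f1)
id_mine=$(git show feature~1 | git patch-id | cut -d' ' -f1)
echo "upstream fix: $id_up"
echo "my fix:       $id_mine"

# Rebase feature onto upstream, then list what was actually copied
git rebase upstream
git log --oneline upstream..feature
```

After the rebase, only the "feature work" commit sits on top of upstream; the local copy of the fix is not carried over.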


[1] In Git 2.39, the patch ID computation code has changed, partly to fix some bugs and partly to allow retaining indentation-related white space. See the new --verbatim option in particular, and the detail in this answer from VonC.

torek
  • When you say the fix is accepted upstream, did you mean the patch that I have also applied on the branch I want to rebase? Sorry, I did not understand it completely. Also, if 5 lines are added before the patch is applied upstream, then the patch ID will be different (as the context will differ); shouldn't rebase still omit the patch from the branch that we are rebasing? – Number945 Nov 04 '19 at 19:56
  • Remember that Linus Torvalds & co tend to use an email based patch system. So "accepted in upstream" means: developer worked on feature, found bug, fixed bug, sent a commit to Linus. Linus verified that this commit fixed the bug and put it into Official Linux, as a *different commit* (different hash ID, next-release branch, etc.). Now the developer is about to rebase his feature on the next-release-upcoming branch, which has his patch (as committed by Linus) in it. It's not made to the same source lines because the source moved around within the file, but it *is* in the branch. – torek Nov 04 '19 at 20:08
  • If I say `git rebase master topic`, what `rev-list` command will git run? Will it be `git rev-list --left-right master...topic`? – Number945 Nov 13 '19 at 18:01
  • Look at the old shell-script version of [interactive rebase](https://github.com/git/git/blob/a17c56c056d5fea0843b429132904c429a900229/git-rebase--interactive.sh#L967-L986). (The new rebase has `git rev-list` built in as C code.) – torek Nov 13 '19 at 18:14
  • Note: `git patch-id` no longer *always* strips down the diff text, as of [Git 2.39 (Q4 2022)](https://stackoverflow.com/a/63674369/6309). – VonC Oct 31 '22 at 16:15