Why is "git format-patch" between two commits different from "git diff"?

Question

Suppose there are two branches, x1 and x2, both based on master. git diff x1..x2 shows the diff between x1 and x2, but git format-patch x1..x2 -1 --stdout shows the diff between master and x2. Is there a way to make format-patch show the diff between x1 and x2?

For example:

git branch x1 master
git branch x2 master
git switch x1
echo a >> README
git commit -a -m x1 # one extra line at x1
git switch x2
echo a >> README
echo a >> README
git commit -a -m x2 # 2 extra lines at x2

git diff x1..x2 # shows one line between x1 and x2
git format-patch -1 --stdout x1..x2 # shows 2 lines between master and x2

If I understand correctly, it seems git format-patch only emits real commits (could be stashed) but it will not try to calculate interdiff.

score 2 · Answer 1 · answered Jul 08 '20 at 20:49

Why is “git format-patch” between two commits different from “git diff”?

The interpretation of .. is different for the two commands. git diff is looking for two commits; git format-patch is looking for one or more patch series.

They're both trying to be helpful, to turn what you told them into something they can work with.

git diff effectively ignores the two dots. It's looking for two commits, you gave it two commits, those are the two commits. This is perhaps less baffling when you consider that it also has its own (quite useful, and sensible) interpretation of the three-dot syntax.

git format-patch is looking for patch series, so it uses the more common dots interpretation shared among all git rev-list-driven commands.

score 1 · Answer 2 · answered Jul 08 '20 at 19:59

TL;DR: if you're looking for something like an interdiff, consider git range-diff.
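
As a rough sketch of what that looks like with the branches from the question: the script below rebuilds the x1/x2 layout in a throwaway repository and runs git range-diff on it. The temporary directory, the identity settings, and the initial README content are assumptions made only so the script runs on its own, and git range-diff needs Git 2.19 or later. What it prints for such tiny commits depends on whether it decides to pair them up, so no output is shown here.

```shell
#!/bin/sh
# Rebuild the question's layout in a throwaway repository (path from mktemp).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example"            # placeholder identity, local to this repo
git config user.email "example@example.com"
echo base > README
git add README
git commit -q -m base
git branch x1
git branch x2
git checkout -q x1
echo a >> README
git commit -q -a -m x1
git checkout -q x2
echo a >> README
echo a >> README
git commit -q -a -m x2

# Compare the commits in master..x1 against the commits in master..x2.
out=$(git range-diff master x1 x2)
printf '%s\n' "$out"
```

The three-argument form used here is shorthand for git range-diff master..x1 master..x2.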

Long

If I understand correctly, it seems git format-patch only emits real commits (could be stashed) but it will not try to calculate interdiff.

This is correct, but may be coming from some wrong assumptions.

In particular, git format-patch is a method of turning a commit—or a whole chain of commits—into something that will survive email. Whoever gets the patch can use git am to create an equivalent commit, preserving most of the metadata and the git patch-id.¹

Since this is aimed at preserving a chain of commits, format-patch looks specifically at a single chain of commits. The x1..x2 syntax you used is allowed. When you give it a single revision instead, that revision is treated as a since specifier: git format-patch X means git format-patch X..HEAD.

Note that in this kind of revision range—A..B for any valid specifiers A and B; see the gitrevisions documentation for more detail—commit B acts as a positive reference and is normally included in the selected commits, but A is a negative reference, as in ^A, and is always excluded. Hence if A and B name the same commit, B winds up being excluded too. However, you can exclude a commit that normally wouldn't appear anyway. In particular, you have created this situation:

       I   <-- x1
      /
...--H   <-- master
      \
       J   <-- x2

such that the commits selected by x1..x2 are always just commit J, and the commits selected by x2..x1 are always just commit I—the same set you'd get with master..x2 and master..x1 respectively.
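
One way to check this for yourself is to ask git rev-list, which resolves ranges the same way. The following is a minimal sketch that recreates the graph above in a scratch repository; the temporary directory, the placeholder identity, and the "base" line in README are assumptions, not part of the original setup.

```shell
#!/bin/sh
# Recreate H (master), I (x1) and J (x2) in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example"            # placeholder identity, local to this repo
git config user.email "example@example.com"
echo base > README
git add README
git commit -q -m base                     # commit H
git branch x1
git branch x2
git checkout -q x1
echo a >> README
git commit -q -a -m x1                    # commit I
git checkout -q x2
echo a >> README
echo a >> README
git commit -q -a -m x2                    # commit J

# List the commits each range selects.
x1_to_x2=$(git rev-list x1..x2)
master_to_x2=$(git rev-list master..x2)
x2_to_x1=$(git rev-list x2..x1)

echo "x1..x2     selects: $x1_to_x2"
echo "master..x2 selects: $master_to_x2"
echo "x2..x1     selects: $x2_to_x1"
```

The first two lines print the same single hash (commit J), and the third prints the hash of commit I, which is why format-patch x1..x2 emits the whole of commit J rather than a diff against x1.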

(Note: the -number option to git format-patch forces a change in behavior when given a single commit specifier, instead of a range. It doesn't seem to have any effect when using a range specifier.)

The git diff command, however, does not obey these rules. Normally, a range A..B means all commits reachable from B, excluding all commits reachable from A, which is what we saw above with the master/x1/x2 fork. But git diff A..B just means git diff A B. Git simply extracts commit A to use as the left side, and extracts commit B to use as the right side, and compares them. There is no interdiff here either, nor any attempt to reach back to the common commit on master.

The git diff command has a special interpretation for the three-dot range syntax too: git diff A...B means find a merge base between A and B, then compare that to commit B. This is also not any kind of interdiff—git diff just doesn't do that at all.
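
The two git diff spellings can be compared side by side. This is a small sketch on the same x1/x2 layout as the question, built in a scratch repository; the temporary directory, the placeholder identity, and the "base" line in README are assumptions added so it runs standalone.

```shell
#!/bin/sh
# Build the question's x1/x2 fork in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example"            # placeholder identity, local to this repo
git config user.email "example@example.com"
echo base > README
git add README
git commit -q -m base
git branch x1
git branch x2
git checkout -q x1
echo a >> README
git commit -q -a -m x1
git checkout -q x2
echo a >> README
echo a >> README
git commit -q -a -m x2

# Two dots: the same as naming the two commits directly.
two_dot=$(git diff x1..x2)
two_args=$(git diff x1 x2)

# Three dots: compare the merge base of x1 and x2 against x2.
three_dot=$(git diff x1...x2)
from_base=$(git diff "$(git merge-base x1 x2)" x2)

# Count the added "a" lines in each diff.
added_two_dot=$(printf '%s\n' "$two_dot" | grep -c '^+a$')
added_three_dot=$(printf '%s\n' "$three_dot" | grep -c '^+a$')
echo "x1..x2 adds $added_two_dot line(s); x1...x2 adds $added_three_dot line(s)"
```

Here the two-dot diff adds one line (x1 already has one of the two), while the three-dot diff adds two, because it starts from the shared commit on master.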

¹The committer and committer-date are, by default, new when using git am. However, you can also add --committer-date-is-author-date. If you do this and you set things up so that your committer name matches the original committer name, and apply it to the right starting commit, you'll even get a commit that has the same hash ID, provided the original commit has the same author and committer dates. That is, with a little effort, sometimes an emailed patch can result in a bit-for-bit identical commit, as if you had used git bundle or similar.

Thanks for the informational explanation. In fact, I was trying to find a diff output that I can directly `git am` without calling `git commit`. Now I think I can simply cat the header of `git format-patch` and the content of `git diff`. — speedogoo, Jul 09 '20 at 07:12

VonC · Answer 3 · 2023-05-20T07:55:58.853

There is another scenario where git format-patch is actually using git diff with more than two commits, as explained in this 2018 thread:

When re-submitting a patch series, it is often helpful (for reviewers) to include an interdiff or range-diff against the previous version.

Doing so requires manually running git-diff or git-range-diff and copy/pasting the result into the cover letter of the new version.

This series automates the process by introducing git-format-patch options --interdiff and --range-diff which insert such a diff into the cover-letter or into the commentary section of the lone patch of a 1-patch series.
In the latter case, the interdiff or range-diff is indented to avoid confusing git-am and human readers.

And, in another 2018 thread:

When submitting a revised a patch series, the --range-diff option embeds a range-diff in the cover letter showing changes since the previous version of the patch series.
The argument to --range-diff is a simple revision naming the tip of the previous series, which works fine if the previous and current versions of the patch series share a common base.

However, it fails if the revision ranges of the old and new versions of the series are disjoint.
To address this shortcoming, extend --range-diff to also accept an explicit revision range for the previous series.
For example:
git format-patch --cover-letter --range-diff=v1~3..v1 -3 v2

This is fixed with Git 2.29 (Q4 2020), where "format-patch --range-diff=<prev> <origin>..HEAD" has been taught not to ignore <origin> when <prev> is a single version.

See commit 07a7f8d, commit 72a7239, commit cdffbdc (08 Sep 2020) by Eric Sunshine (sunshineco).
(Merged by Junio C Hamano -- gitster -- in commit 634e008, 22 Sep 2020)

format-patch: use 'origin' as start of current-series-range when known

Signed-off-by: Eric Sunshine

When formatting a patch series over origin..HEAD, one would expect that range to be used as the current-series-range when computing a range-diff between the previous and current versions of a patch series.

However, infer_range_diff_ranges() ignores origin..HEAD when --range-diff=<prev> specifies a single revision rather than a range, and instead unexpectedly computes the current-series-range based upon <prev>.
Address this anomaly by unconditionally using origin..HEAD as the current-series-range regardless of <prev> as long as origin is known, and only fall back to basing current-series-range on <prev> when origin is not known.

Note that with Git 2.41 (Q2 2023), the default option is clarified.

See commit f024913 (02 Apr 2023) by Alex Henrie (alexhenrie).
(Merged by Junio C Hamano -- gitster -- in commit 9e0d1aa, 21 Apr 2023)

format-patch: correct documentation of --thread without an argument

Signed-off-by: Alex Henrie

In Git, almost all command line flags unconditionally override the corresponding config option (see this thread).

Add a test to confirm that this is the case for git format-patch --thread.

git format-patch now includes in its man page:

--thread without an argument is equivalent to --thread=shallow.

With Git 2.41 (Q2 2023), when "git send-email" that uses the validate hook is fed a message without and then with Message-ID, it failed to auto-assign a unique Message-ID to the former and instead reused the Message-ID from the latter, which has been corrected.
This was a fix for a recent regression caught before the release, so no need to mention it in the release notes.

See commit 20bd08a, commit 3ece9bf (17 May 2023) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit b04671b, 19 May 2023)

send-email: clear the $message_id after validation

Tested-by: Douglas Anderson

Recently git-send-email started parsing the same message twice, once to validate all the message before sending even the first one, and then after the validation hook is happy and each message gets sent, to read the contents to find out where to send to etc.

Unfortunately, the effect of reading the messages for validation lingered even after the validation is done.
Namely $message_id gets assigned if exists in the input files but the variable is global, and it is not cleared before pre_process_file runs.
This causes reading a message without a message-id followed by reading a message with a message-id to misbehave---the sub reports as if the message had the same id as the previously written one.

Clear the variable before starting to read the headers in pre_process_file.

You can see in the test for extracting patches:

threaded_patches=$(git format-patch -o threaded --thread=shallow -s --in-reply-to="format" HEAD^1)

See also Git 2.42 (Q3 2023) and a leak fix: [commit c6d26a9](https://github.com/git/git/commit/c6d26a9dda59043f95ee5bf632d6aa80fa50aad7), [commit cfa1209](https://github.com/git/git/commit/cfa120947e6337e7be2658f71a0132e337ee090a) (18 May 2023) by [Jeff King (`peff`)](https://github.com/peff). (Merged by [Junio C Hamano -- `gitster` --](https://github.com/gitster) in [commit e490bea](https://github.com/git/git/commit/e490bea8a68fceb227277fef38daa1b7522b97a5), 13 Jun 2023) — VonC, Jun 16 '23 at 08:23
