
Question Update

(n.b. I've accepted Roland's answer, as it is indeed the correct (and simplest) solution starting from git 1.7.4.4, but please consider this question open regarding earlier versions of git down to 1.7.0.4.)

This question is a bit rambling (primarily due to the edits resulting from my subsequent attempts to establish more information on the situation), but the text in the title is the most important bit.

That is: I'm trying to establish the definitive way to ensure that all git commands will display full (un-abbreviated) hashes in their output.

As I am focussed on backwards-compatibility, this needs to cover older versions of git 1.7. Ideally the solutions would work for git 1.7.0.4 (which is used in the still-supported Ubuntu 10.04 LTS), but I'd be happy with a minimum of 1.7.2.5 (for Debian 6 / Squeeze LTS). Anything requiring a version later than 1.7.9.5 (Ubuntu 12.04 LTS) is definitely not ideal, but I'd still love to hear about them.

Please note that I do not wish to lose the ability to have abbreviated hashes -- the purpose behind this question is to ensure that tools interacting with git can always access a complete and unambiguous hash. When I use git manually on the command line I am going to want the normal abbreviations most of the time.

Roland Smith's suggestion of utilising a command-line argument override for core.abbrev looked ideal, but sadly only works since v1.7.4.4 (as core.abbrev did not previously exist). I suspect this means we do need to determine the most comprehensive set of command-specific arguments (such as git blame -l) to produce the equivalent effect.
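
For a tool that has to decide between the two approaches at run time, one option is to inspect the output of `git --version` first. The following is only a sketch: the helper names are hypothetical, and the 1.7.4.4 threshold is simply the version quoted above.

```python
import re

# First release containing core.abbrev, per the version quoted in this question.
CORE_ABBREV_MIN_VERSION = (1, 7, 4, 4)


def parse_git_version(version_output):
    """Turn text such as 'git version 1.7.9.5' into a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", version_output)
    if match is None:
        raise ValueError("no version number in %r" % (version_output,))
    return tuple(int(part) for part in match.group(1).split("."))


def supports_core_abbrev(version_output):
    """True when the reported git version is at least 1.7.4.4."""
    return parse_git_version(version_output) >= CORE_ABBREV_MIN_VERSION
```

Tuple comparison treats `(1, 7, 4)` as older than `(1, 7, 4, 4)`, which is the ordering wanted here.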

Original Question with Edits

Some (most?) git commands default to outputting abbreviated hashes. For instance, both git blame and git annotate do this, and it was tripping up the current Emacs support when clashes arose (as they can in versions up to git 1.7.11.1 -- see Edit 1 below), because the ambiguous hashes then caused errors when attempting to act upon them.


Begin Edit 1

I note the following in the Changelog, which suggests that the original problem which prompted this question would not arise in more recent versions of git.

Fixes since v1.7.11.1
---------------------
 * "git blame" did not try to make sure that the abbreviated commit
   object names in its output are unique.

If it's the case that git is supposed to guarantee uniqueness (at least at the time the command is run) for all object names reported by any git command, then that would significantly alleviate my concerns; but obviously a solution to the question which supports earlier versions of git is still going to be of interest.

End Edit 1
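
To make the uniqueness point concrete, here is a toy model of what "unique at the time the command is run" means. It is not git's own algorithm, and the function name is hypothetical. It finds the shortest prefix length (starting from the default of 7) at which a known set of full object names no longer collide.

```python
def shortest_unique_length(names, minimum=7):
    """Smallest prefix length >= minimum at which all names are distinct."""
    names = sorted(set(names))
    if not names:
        return minimum
    longest = max(len(name) for name in names)
    for length in range(minimum, longest + 1):
        prefixes = {name[:length] for name in names}
        if len(prefixes) == len(names):
            return length
    return longest
```

Any object created after this calculation can still collide with a prefix that was unique a moment ago, which is why a tool that stores hashes for later use wants the full name.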


That can be fixed with git blame -l and git annotate -l, but I don't know whether these two commands are isolated cases or not, and I want to ensure that this issue can't arise in other situations.

The only related configurations I can see are core.abbrev:

Set the length object names are abbreviated to. If unspecified, many commands abbreviate to 7 hexdigits, which may not be enough for abbreviated object names to stay unique for sufficiently long time.

(but I don't want to remove the option of seeing an abbreviated commit), and log.abbrevCommit which:

If true, makes git-log(1), git-show(1), and git-whatchanged(1) assume --abbrev-commit. You may override this option with --no-abbrev-commit.

The --no-abbrev-commit argument isn't a consistent thing, though -- I presume that only the commands mentioned in that quote recognise it (but see Edit 2 below).


Begin Edit 2

The parse-options API document states:

Boolean long options can be negated (or unset) by prepending no-, e.g. --no-abbrev instead of --abbrev. Conversely, options that begin with no- can be negated by removing it.

So the commands which accept --abbrev (of which there are many) will in fact all accept --no-abbrev as well? This negated option is often not mentioned (although --abbrev=40 would currently be equivalent, of course, even if no negation was available).

It's not clear to me when the default boolean negation option feature was introduced, however.

In my version 1.7.9.5 git-blame --no-abbrev results in single-character object names. In fact it's the same as --abbrev=0, as blame uses n+1 characters. Conversely I notice that git branch -v --abbrev=0 gives the full 40 characters.

End Edit 2


A complete list of the potential problem commands with their appropriate options would be excellent, although the ideal solution would be something that would (or at least should) be respected by all git commands (including future commands), while maintaining the ability to display abbreviated hashes when desired.

An ugly approach which occurred to me was to create a git config file which imports the original config file (although I note that importing is only available from 1.7.10) and then sets core.abbrev to 40; and to use this via a temporary GIT_CONFIG environment variable when invoking git, whenever full commits are a necessity. I guess this would work, but I'd rather not do it.

Clearly there are/were bugs, and some of the bugs at least have since been fixed; but as the aim is supporting as many (as practical) versions of git that a user might reasonably happen to be running, I'm looking for something which is backwards-compatible.

For what it's worth, here's what I've gleaned from grepping the manual for version 1.7.12.4:

Commands accepting --abbrev (and thus in theory also --no-abbrev):

  • blame
  • branch
  • cli
  • describe
  • diff
  • diff-index
  • diff-tree
  • log
  • ls-files
  • ls-tree
  • rev-list
  • rev-parse
  • show-ref

Other options:

  • git annotate -l
  • git blame -l
  • git diff --full-index
  • git log --no-abbrev-commit
  • git show --no-abbrev-commit
  • git whatchanged --no-abbrev-commit
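
For versions without core.abbrev, the "Other options" list above could be encoded as a lookup table along these lines. This is a sketch only: the table contains just the flags listed in this question, has not been verified against every git release, and the function name is hypothetical.

```python
# Flags copied from the "Other options" list in this question.
FULL_HASH_FLAGS = {
    "annotate": ["-l"],
    "blame": ["-l"],
    "diff": ["--full-index"],
    "log": ["--no-abbrev-commit"],
    "show": ["--no-abbrev-commit"],
    "whatchanged": ["--no-abbrev-commit"],
}


def with_full_hash_flags(subcommand, args=()):
    """Build an argv list for `git <subcommand>` that requests full hashes.

    Raises KeyError for subcommands missing from the table, so a gap is
    noticed rather than silently producing abbreviated output.
    """
    flags = FULL_HASH_FLAGS[subcommand]
    return ["git", subcommand] + flags + list(args)
```

For example, `with_full_hash_flags("blame", ["README"])` yields `["git", "blame", "-l", "README"]`.
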
phils
  • You're trying to integrate git into Emacs, is that correct? This looks like an interesting question. Consider putting a hefty bounty on it. –  Apr 27 '14 at 06:13
  • It's more that I'm trying to work around a potential problem for the *existing* support in Emacs (which is very good, especially if the superb [Magit](http://magit.github.io/) library is included); but Emacs is only my focus because that's the tool I happen to use; I think this may be of general interest for tools integrating with Git. – phils Apr 27 '14 at 07:02
  • Note: for an interactive rebase, `git -c core.abbrev=40` wouldn't change anything (starting Git 2.3.1+). See [my answer below](http://stackoverflow.com/a/28578733/6309) – VonC Feb 18 '15 at 08:04
  • With Git 2.31+ (Q1 2021), any `git -c core.abbrev=no xxx` Git command would use only the full hash, be it SHA1 or future SHA2. See [my answer below](https://stackoverflow.com/a/65756797/6309) – VonC Jan 17 '21 at 02:36

4 Answers


Using git -c core.abbrev=40 <command> is supposed to work on all commands because it "will override whatever is defined in the config files".
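
From a tool's point of view, that amounts to prefixing every invocation with the override. A minimal sketch, assuming a SHA-1 repository (hence the hard-coded 40) and hypothetical helper names:

```python
import subprocess

SHA1_HEX_LENGTH = 40  # assumption: the repository uses SHA-1 object names


def git_argv_full_hashes(args, hex_length=SHA1_HEX_LENGTH):
    """Prefix a git argument list with a one-off core.abbrev override."""
    return ["git", "-c", "core.abbrev=%d" % hex_length] + list(args)


def run_git_full_hashes(args, cwd=None):
    """Run git with the override and return its standard output as text."""
    argv = git_argv_full_hashes(args)
    return subprocess.check_output(argv, cwd=cwd, universal_newlines=True)
```

Because the override is passed per invocation, the user's own configuration (and their abbreviated output on the command line) is left alone.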

The -c option seems to have been introduced in commit 8b1fa778676ae94f7a6d4113fa90947b548154dd (landed in version 1.7.2).

Edit2: As phils noticed, the core.abbrev parameter was added in 1.7.4.4.

Edit: W.r.t. hardcoded hash lengths, you could always look up the hash lengths by looking at the filename lengths in .git/objects/* when initializing your program/library.
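
A rough sketch of that lookup, with hypothetical function names. It relies on loose objects being stored as a two-character directory plus the remaining hex digits as the file name, and it returns None when no loose object is found (for instance when everything is packed), so a caller would still need a fallback.

```python
import os
import string

HEX_DIGITS = set(string.hexdigits.lower())


def _is_hex(text):
    """True for a non-empty string of lowercase hex digits."""
    return bool(text) and all(ch in HEX_DIGITS for ch in text)


def loose_object_name(fanout_dir, filename):
    """Join e.g. 'ab' and 'cdef...' into an object name, or return None."""
    if len(fanout_dir) == 2 and _is_hex(fanout_dir) and _is_hex(filename):
        return fanout_dir + filename
    return None


def hash_length_from_loose_objects(objects_dir):
    """Hex length of the first loose object under objects_dir, or None."""
    try:
        entries = sorted(os.listdir(objects_dir))
    except OSError:
        return None
    for fanout_dir in entries:
        subdir = os.path.join(objects_dir, fanout_dir)
        if not os.path.isdir(subdir):
            continue
        for filename in sorted(os.listdir(subdir)):
            name = loose_object_name(fanout_dir, filename)
            if name is not None:
                return len(name)
    return None
```

Directories such as `info` and `pack` are skipped because their names are not two hex digits.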

Roland Smith
  • Well I guess that's me not seeing the wood for the trees :) I do wish there was a solution which didn't hard-code the limit of 40 characters (I think I read that enabling longer hashes is in the roadmap), but in practice I'm sure that's not going to be a noteworthy problem. I'll do some testing sometime during the week to verify. – phils Apr 27 '14 at 12:51
  • I was going to suggest using a sufficiently huge value, but at least on 1.9.0 using any value >40 gives a fatal error. – Roland Smith Apr 27 '14 at 12:59
  • @RolandSmith is using more than 40 characters even possible? I thought sha1 hashes were `160 bits / 4 bits/hex = 40 hex` characters [by definition](http://en.wikipedia.org/wiki/SHA-1)? –  Apr 28 '14 at 18:54
  • @Cupcake You are right about SHA1. But Phils mentioned in a comment above that "enabling longer hashes is in the roadmap". As soon as a longer hash is used, the number 40 has to go up if you want to see the full hash. – Roland Smith Apr 28 '14 at 19:22
  • I'm sad to report that under git 1.7.2.5 (from a fresh install of Debian 6, and a new test repository), `git -c core.abbrev=40 blame ` produces the same abbreviated commit hashes as `git blame `. When was `core.abbrev` added, I wonder? – phils Apr 29 '14 at 04:05
  • Yes, `core.abbrev` first appears in v1.7.4.4 by the looks of it. Sigh. – phils Apr 29 '14 at 04:25
  • @phils Bummer. But surely updating git isn't _that_ hard? – Roland Smith Apr 29 '14 at 18:00
  • Yeah, it's disappointing :/ But the goal is to support whichever existing version of git a user might reasonably have, and the package repositories for those older Long Term Support distributions of Debian and Ubuntu do not contain newer versions of git; so although people running those LTS distros *can* obtain a backport from a newer version if they wish to (or compile it for themselves), it's not something that they get by default when they update the OS. – phils Apr 29 '14 at 22:06
  • Note that `core.abbrevlength` never existed in a release, IIRC. The feature was initially added with that name, but shortened to `core.abbrev` before being released. – phils May 04 '14 at 04:24
  • @phils: I had trouble finding `core.abbrevlength` in the repo. Now I know why. – Roland Smith May 04 '14 at 10:39
  • Thanks for your help with this, Roland. I've accepted the answer, as it certainly seems like the correct solution when core.abbrev exists, which hopefully covers most people. – phils May 05 '14 at 22:32
  • Using git 2.16.4, I find that `git diff --abbrev=n` only affects the extra `--raw` diff output. But if I use `git -c core.abbrev=n diff` the entire diff output is affected – Mort Aug 27 '18 at 15:19

The updated answer (2021) comes with Git 2.31 (Q1 2021):

The configuration variable 'core.abbrev' can be set to 'no' to force no abbreviation regardless of the hash algorithm.

That will matter when Git switches from SHA1 to SHA2.

See commit a9ecaa0 (01 Sep 2020) by Eric Wong (ele828).
(Merged by Junio C Hamano -- gitster -- in commit 6dbbae1, 15 Jan 2021)

core.abbrev=no: disables abbreviations

Signed-off-by: Eric Wong

This allows users to write hash-agnostic scripts and configs by disabling abbreviations.

Using "-c core.abbrev=40" will be insufficient with SHA-256, and "-c core.abbrev=64" won't work with SHA-1 repos today.

[jc: tweaked implementation, added doc and a test]

git config now includes in its man page:

If set to "no", no abbreviation is made and the object names are shown in their full length.
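
A script that has to run against both old and new versions could pick the override value from the git version. The sketch below uses hypothetical names and takes the 2.31 threshold from this answer; the hex lengths are derived from the digest sizes rather than typed in.

```python
import hashlib

# Hex digits in a full object name: two per digest byte.
HEX_LENGTH = {
    "sha1": hashlib.sha1().digest_size * 2,      # 20 bytes -> 40 hex digits
    "sha256": hashlib.sha256().digest_size * 2,  # 32 bytes -> 64 hex digits
}

# First release accepting core.abbrev=no, per this answer.
CORE_ABBREV_NO_MIN_VERSION = (2, 31)


def core_abbrev_override(git_version, hash_algorithm="sha1"):
    """Return the value to pass after `git -c` for unabbreviated output.

    git_version is a tuple of ints such as (2, 30, 1).
    """
    if git_version >= CORE_ABBREV_NO_MIN_VERSION:
        return "core.abbrev=no"
    return "core.abbrev=%d" % HEX_LENGTH[hash_algorithm]
```

On an older git the caller still has to know which hash algorithm the repository uses, which is exactly the problem `core.abbrev=no` removes.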

VonC
  • Thank you for your continued follow-ups to this question. I'd have up-voted them in the past, but I have no recollection of Stack Overflow ever telling me that you'd added them! As it is, I'm up-voting only this latest Answer in order to float it above the others, as this does sound like the comprehensive solution for the future. Cheers! – phils Jan 17 '21 at 10:15
  • No problem. This is indeed the most interesting option out of my three answers. – VonC Jan 17 '21 at 10:20

Note: using git -c core.abbrev=x rebase -i works well for the editor (which will show abbreviated commit SHA1s).

BUT: rebase was also using that same abbreviated SHA1 internally at the start of the rebase itself.

With the fix below that is no longer the case (meaning git -c core.abbrev=40 rebase -i is not needed at all).

See commit edb72d5 from Kirill A. Shutemov, for Git 2.3.1+ (Q1/Q2 2015):

rebase -i: use full object name internally throughout the script

In earlier days, the abbreviated commit object name shown to the end users were generated with hardcoded --abbrev=7; commit 5689503 (rebase -i: respect core.abbrev, 2013-09-28, Git 1.8.5+) tried to make it honor the user specified core.abbrev, but it missed the very initial invocation of the editor.

These days, we try to use the full 40-hex object names internally to avoid ambiguity that can arise after rebase starts running.
Newly created objects during the rebase may share the same prefix with existing commits listed in the insn sheet.
These object names are shortened just before invoking the sequence editor to present the insn sheet to the end user, and then expanded back to full object names when the editor returns.

But the code still used the shortened names when preparing the insn sheet for the very first time, resulting "7 hexdigits or more" output to the user.

Change the code to use full 40-hex commit object names from the very beginning to make things more uniform.

Note: for an interactive rebase, the "insn sheet" is the instruction sheet. See commit 3322ad4 for illustration.
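
The ambiguity the commit message describes can be illustrated with a toy model of the "expand back to full object names" step. This is not the rebase code itself, and the names used are hypothetical.

```python
class AmbiguousName(Exception):
    """Raised when an abbreviation matches more than one full name."""


def expand_object_name(abbrev, full_names):
    """Return the single full name starting with abbrev.

    Raises KeyError if nothing matches and AmbiguousName if several do.
    """
    matches = sorted(set(name for name in full_names if name.startswith(abbrev)))
    if not matches:
        raise KeyError(abbrev)
    if len(matches) > 1:
        raise AmbiguousName("%s matches %d objects" % (abbrev, len(matches)))
    return matches[0]
```

An abbreviation that expands cleanly before the rebase starts can become ambiguous once the rebase has created new objects sharing its prefix, which is the reason for keeping full names internally.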

VonC

Note that with Git 2.12 (Q1 2017), you can add git diff --no-index to the list of commands with --no-abbrev:

See commit 43d1948 (06 Dec 2016) by Jack Bates (jablko).
(Merged by Junio C Hamano -- gitster -- in commit c89606f, 19 Dec 2016)

diff: handle --no-abbrev in no-index case

There are two different places where the --no-abbrev option is parsed, and two different places where SHA-1s are abbreviated.
We normally parse --no-abbrev with setup_revisions(), but in the no-index case, "git diff" calls diff_opt_parse() directly, and diff_opt_parse() didn't handle --no-abbrev until now. (It did handle --abbrev, however.)
We normally abbreviate SHA-1s with find_unique_abbrev(), but commit 4f03666 ("diff: handle sha1 abbreviations outside of repository", 2016-10-20) recently introduced a special case when you run "git diff" outside of a repository.

setup_revisions() does also call diff_opt_parse(), but not for --abbrev or --no-abbrev, which it handles itself.
setup_revisions() sets rev_info->abbrev, and later copies that to diff_options->abbrev. It handles --no-abbrev by setting abbrev to zero. (This change doesn't touch that.)

Setting abbrev to zero was broken in the outside-of-a-repository special case, which until now resulted in a truly zero-length SHA-1, rather than taking zero to mean do not abbreviate.
The only way to trigger this bug, however, was by running "git diff --raw" without either the --abbrev or --no-abbrev options, because

  1. without --raw it doesn't respect abbrev (which is bizarre, but has been that way forever),
  2. we silently clamp --abbrev=0 to MINIMUM_ABBREV, and
  3. --no-abbrev wasn't handled until now.

The outside-of-a-repository case is one of three no-index cases. The other two are when one of the files you're comparing is outside of the repository you're in, and the --no-index option.

VonC