The kind of trick you are talking about is called a preimage attack. Current techniques for generating deliberate SHA-1 collisions require that the bogus duplicate's content contain a large "binary area" (basically a contiguous blob of bytes) where the attacker can manipulate those bytes. PDF files are good candidates here because PDFs may contain such blocks.
Git commit and tag objects, however, do not contain such blocks. They do have an area in which one could drop a block like this, but this area shows up as the log message or tag message when you examine the commit (with git log or git show) or the tag (with git show). It would be hard for a human to miss the change: at the point the particular commit or tag was "blessed" as "okay to use", the message was something like:
Release version x.y
and now it's:
Release version x.y
filler filler filler ... filler
<random bytes to produce desired hash>
<this section goes on and on for many pages>
footer footer footer ... footer
An automated software system that doesn't bother looking at the commit message or tag message could be fooled, but it would be simple enough to add an entropy detector that notices that what's in the message here no longer matches the kind of data humans generate (which has relatively low entropy; see this blog entry on Shannon entropy and this IBM security document). That's a dead giveaway, and that computation can be automated.
(The message size will also have jumped from "tiny" to "relatively huge", which can be used as well, perhaps independently.)
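As a rough illustration of such a check, here is a minimal Python sketch that computes the Shannon entropy of a message in bits per byte and also applies a size limit. The names looks_suspicious, MAX_ENTROPY_BITS and MAX_MESSAGE_BYTES, and both threshold values, are assumptions made up for this example; you would need to tune them against real commit messages from your own repositories.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical thresholds, chosen only for illustration.
MAX_ENTROPY_BITS = 6.0      # human-written text tends to sit well below this
MAX_MESSAGE_BYTES = 10_000  # "tiny" messages should never get near this


def looks_suspicious(message: bytes) -> bool:
    """Flag a commit or tag message that is too large or too random-looking."""
    if len(message) > MAX_MESSAGE_BYTES:
        return True
    return shannon_entropy(message) > MAX_ENTROPY_BITS
```

You could feed this the raw message bytes of each commit or tag before trusting it. Note that entropy is measured over the whole message here, so a small binary block hidden in a lot of ordinary filler text could slip under the entropy threshold; that is one reason to keep the size check as a separate test.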
Still, if you like, you can experiment with the new SHA-256 variant of Git. (You cannot mix variants though: you must either use SHA-1 only, or SHA-256 only. At least, that's the case today.)
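If you want to see what actually differs between the two variants, here is a small Python sketch of how an object's name is derived: Git hashes a header of the form "type size" followed by a NUL byte, then the object's content. The helper name git_object_id is made up for this example, and it only mimics the hashing step; it does not create or read real repositories.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes, algo: str = "sha1") -> str:
    """Hash "<type> <size>\\0" + content, the way Git names its objects.

    algo is "sha1" for classic repositories or "sha256" for the newer variant.
    """
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.new(algo, header + content).hexdigest()


# The same (empty) blob gets a completely different name under each variant.
print(git_object_id("blob", b"", "sha1"))
print(git_object_id("blob", b"", "sha256"))
```

The two IDs have different lengths (40 versus 64 hex digits) and are unrelated to each other, which gives some feel for why a single repository has to pick one variant.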