A SHA-1 hash is 160 bits long. That gives you 2^160, or exactly
1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976
possible hashes.
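If you want to check that figure yourself, here is a minimal Python sketch. The variable names are just for illustration; `hashlib.sha1().digest_size` is the digest length in bytes as reported by the standard library.

```python
import hashlib

# SHA-1 digests are 20 bytes, i.e. 160 bits.
digest_bits = hashlib.sha1().digest_size * 8

# Number of distinct values a 160-bit hash can take.
total_hashes = 2 ** digest_bits

# Print with thousands separators to compare against the figure above.
print(digest_bits)
print(f"{total_hashes:,}")
```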
Assuming hash values are more or less unpredictable, the odds of two files accidentally having the same hash are so small that it's just not worth worrying about.
Quoting from Scott Chacon's book "Pro Git":
However, you should be aware of how ridiculously unlikely this
scenario is. The SHA–1 digest is 20 bytes or 160 bits. The number of
randomly hashed objects needed to ensure a 50% probability of a single
collision is about 2^80.
...
Here’s an example to give you an idea of what it would take to get a
SHA–1 collision. If all 6.5 billion humans on Earth were programming,
and every second, each one was producing code that was the equivalent
of the entire Linux kernel history (1 million Git objects) and pushing
it into one enormous Git repository, it would take 5 years until that
repository contained enough objects to have a 50% probability of a
single SHA–1 object collision. A higher probability exists that every
member of your programming team will be attacked and killed by wolves
in unrelated incidents on the same night.
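As a rough sanity check on the quoted numbers, here is a sketch using the standard birthday approximation, p ≈ 1 - exp(-n^2 / (2N)), where N is the number of possible hashes and n the number of randomly hashed objects. The function names are my own, and the approximation assumes hashes behave like uniformly random values; it is not taken from the book.

```python
import math

HASH_SPACE = 2 ** 160  # number of possible SHA-1 values


def collision_probability(n_objects: float, space: int = HASH_SPACE) -> float:
    """Birthday approximation: p ~= 1 - exp(-n^2 / (2 * space))."""
    return -math.expm1(-(n_objects ** 2) / (2 * space))


def objects_for_probability(p: float, space: int = HASH_SPACE) -> float:
    """Inverse of the approximation: n ~= sqrt(2 * space * ln(1 / (1 - p)))."""
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


# Objects needed for a 50% chance of at least one collision.
n_half = objects_for_probability(0.5)
print(f"n for 50%: about 2^{math.log2(n_half):.2f}")

# The scenario from the quote: 6.5 billion people, each producing
# 1 million objects per second, for 5 years.
seconds_in_5_years = 5 * 365.25 * 24 * 3600
n_scenario = 6.5e9 * 1e6 * seconds_in_5_years
print(f"scenario objects: about 2^{math.log2(n_scenario):.2f}")
print(f"scenario collision probability: {collision_probability(n_scenario):.2f}")
```

Under these assumptions both counts come out close to 2^80, so the scenario is the right order of magnitude even if the exact percentage depends on how you round.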
It's true that there must be two 21-byte files that have the same SHA-1 hash (since there are 2^168 such files and only 2^160 possible SHA-1 hashes). No such files have ever been discovered.
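The pigeonhole argument is easier to see at a smaller scale. The sketch below uses a toy hash (only the first byte of SHA-1, so 256 possible values, which is my own construction for illustration) and feeds it 257 distinct inputs, which guarantees at least one collision. The real case works the same way: 2^168 inputs into 2^160 outputs means an average of 256 inputs per hash value.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Toy hash: the first byte of SHA-1, so only 256 possible values."""
    return hashlib.sha1(data).digest()[0]


def first_collision(inputs):
    """Return (earlier_input, later_input, hash) for the first repeat, or None."""
    seen = {}
    for data in inputs:
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    return None


# 257 distinct 2-byte inputs into 256 buckets: a collision must exist.
inputs = [i.to_bytes(2, "big") for i in range(257)]
result = first_collision(inputs)
a, b, h = result
print(a.hex(), b.hex(), h)

# Average number of 21-byte files per SHA-1 value.
print(2 ** 168 // 2 ** 160)
```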
UPDATE: As of February 2017, two distinct PDF files with identical SHA-1 checksums have been generated, using a technique that's more than 100,000 times as fast as a brute force attack. Details here: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Linus Torvalds (the author of Git) has posted a (preliminary) response here: http://marc.info/?l=git&m=148787047422954
Looking at the comments, it seems that the OP's original misunderstanding was an assumption that the SHA-1 hash could be used to determine the contents of the file. It can't. Git uses the SHA-1 hash to construct the name of the file or other object. The file itself is stored somewhere under the .git/objects directory.
For example, a file with a hash of
ff5a5eff8c90da934937165c9d0e9f96f9ecaf75
might be stored in
.git/objects/ff/5a5eff8c90da934937165c9d0e9f96f9ecaf75
-- and that file can be arbitrarily large. (It's not that simple, of course; git plays a lot of tricks to combine similar files and otherwise compress data.) Thanks to Patrick Schlüter for his comment.
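To make the name-from-hash idea concrete, here is a small Python sketch of how the ID of a blob (a file's contents) maps to a path. Git hashes a short header ("blob", the content length, and a NUL byte) followed by the content, not the raw file alone. The function names are mine, and the sketch only covers loose objects; on disk those are zlib-compressed, and packed objects are stored differently.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 of the blob header plus content, as a 40-character hex string."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """First two hex digits name the directory, the remaining 38 the file."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


blob_id = git_blob_id(b"hello world\n")
print(blob_id)
print(loose_object_path(blob_id))
```

The hash only names the object. Going the other way, from the 40 hex digits back to the content, requires looking the object up in the repository.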