Given a path name within a repository, and optionally a commit specifier, you can examine Git's unique ID for the file's content:
$ git hash-object -t blob Makefile
5a969f5830a4105d3e3e6236eaa51e19880cc873
$ git rev-parse :Makefile
5a969f5830a4105d3e3e6236eaa51e19880cc873
$ git rev-parse HEAD:Makefile
5a969f5830a4105d3e3e6236eaa51e19880cc873
(In this case, all three copies of the file are identical: Makefile is in the work-tree, :Makefile is in the index, and HEAD:Makefile is in the current commit.)
$ git rev-parse v2.1.0:Makefile
2320de592e6dbc545866e6bfef09a05f660c2c14
(The version of Makefile committed in commit v2.1.0 is not the same as the three above.)
Note that although Git still uses SHA-1, this ID is not the same as the SHA-1 of the file's raw content:
$ sha1sum Makefile
857f75d0f314501dfdfcc5b6a4306eba1faddd31 Makefile
$ python
[python startup messages]
>>> import hashlib
>>> hashlib.sha1(open('Makefile', 'rb').read()).hexdigest()
'857f75d0f314501dfdfcc5b6a4306eba1faddd31'
This is because Git checksums the data after prepending a header: the object type (blob), a space, the data length in bytes as a decimal number, and a NUL byte:
>>> data = open('Makefile', 'rb').read()
>>> hashlib.sha1('blob {}\0'.format(len(data)).encode('ascii') + data).hexdigest()
'5a969f5830a4105d3e3e6236eaa51e19880cc873'
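The same computation can be wrapped in a small helper. This is a minimal sketch, assuming a repository that uses the SHA-1 object format; the function name git_blob_id is made up for illustration and is not part of Git or any library.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the ID Git would assign to a blob holding exactly these bytes.

    Hypothetical helper; assumes the SHA-1 object format.
    """
    # Header: object type, a space, the length in decimal, and a NUL byte.
    header = "blob {}\0".format(len(data)).encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known ID in SHA-1 repositories.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

One caveat: when given a path, git hash-object may first apply any configured input filters (such as end-of-line conversion), so hashing the raw work-tree bytes this way matches only when no such filter changes the content.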
Note, however, that if you add a header to a file and then checksum the resulting file, you'll get a new and different checksum, because you're now checksumming the header plus the file data. If you store the new checksum in the file and checksum the result, you'll get yet a third checksum. To avoid this problem of ever-changing checksums, you need either a weaker checksum, one where you can compute the right input to get a desired output (e.g., an IP-header-style checksum), or a checksum computed over the data excluding the checksum itself. Or, of course, you can store the checksum outside the file, as Git does.
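Here is a rough sketch of both the problem and the "exclude the checksum itself" workaround. The "checksum: " line format and the stamp and verify helpers are invented for this example; they are not anything Git does.

```python
import hashlib

PREFIX = b"checksum: "  # hypothetical in-file header format


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def stamp(body: bytes) -> bytes:
    """Prepend a checksum line computed over the body only."""
    return PREFIX + sha1_hex(body).encode("ascii") + b"\n" + body


def verify(stamped: bytes) -> bool:
    """Recompute the checksum over everything after the first line."""
    header, _, rest = stamped.partition(b"\n")
    if not header.startswith(PREFIX):
        return False
    return header[len(PREFIX):] == sha1_hex(rest).encode("ascii")


body = b"all: build\n"

# The problem: once the checksum is embedded, the whole file hashes differently.
whole_file_before = sha1_hex(body)
whole_file_after = sha1_hex(stamp(body))
print(whole_file_before != whole_file_after)  # True

# The workaround: verification skips the checksum line itself.
print(verify(stamp(body)))  # True
```

The cost of this design is that every reader must know which bytes to skip, which is one reason storing the ID outside the content is simpler.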
If you have some other source for unique identifiers, you can just generate them, rather than linking them to the file's content. How to do that is up to you.
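As one possible approach, and purely as an illustration, Python's standard uuid module can generate random identifiers that have nothing to do with the file's content. The new_file_id function and the path-to-ID dictionary below are assumptions made for this sketch; where you actually keep the mapping is up to you.

```python
import uuid


def new_file_id() -> str:
    """Return a random 128-bit identifier as 32 hex digits (hypothetical helper)."""
    return uuid.uuid4().hex


# Hypothetical external table mapping paths to IDs, kept outside the files.
file_ids = {}
file_ids["Makefile"] = new_file_id()
```

Unlike a content hash, an ID like this stays the same when the file changes, and two files with identical content get different IDs.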