As I said in a comment, the problem here boils down to linearizing. If you want a simple incrementing count to specify some particular commit, you must have a single source point that makes this simple incrementing count.
In SVN, there is an obvious place to do this: all commits are stored on a master central server. In order to make a new commit, you call up the central server and say: make a new commit. This either succeeds—and can get a simple, incrementing number—or it fails and there is no commit.
In Git, there is no designated central server. Each developer makes his or her own commits. Commits are exchanged between peers. The globally unique identifier for any given commit is its hash: Git guarantees that no two commits ever have the same hash.¹
The lack of a single central counting point destroys the usefulness of making your own simple revision count, as different repositories can and will have the same number of commits without containing the same set of commits. I may have 17 commits, of which 2 are different from your 17 commits, so that if we combine our two repositories, we both wind up with 19 commits. (If I combine yours with mine, I get 19 commits—two new ones I get from you, plus the 15 we already shared—while you still have 17: you must still pick up the two commits I have that you lack.)
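To make the arithmetic concrete, here is a small sketch that models each repository as a set of commit IDs. The IDs (c0, m1, y1, and so on) are made up for illustration; real Git commits are identified by their hashes.

```python
# Hypothetical commit IDs standing in for real Git hashes.
shared = {f"c{i}" for i in range(15)}   # the 15 commits we both have
mine = shared | {"m1", "m2"}            # my 17 commits
yours = shared | {"y1", "y2"}           # your 17 commits

# Same count, different contents: the count alone identifies nothing.
assert len(mine) == len(yours) == 17
assert mine != yours

# I pick up your commits, so my repository now holds the union.
mine_after_fetch = mine | yours
print(len(mine_after_fetch))  # 19
print(len(yours))             # 17, until you pick up my two commits as well
```

The number 17 names two different sets here, which is exactly why a per-repository count cannot serve as a global identifier.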
You can, however, use your idea: simply designate a central counting point:
"One solution would be to maintain a translation table that lists each commit hash and maps it to a revision number but this makes life much harder."
It's not that much harder if you already have a central server. For instance, if any release build is done on the "release-build" system, and the release-build system has a Git repository, you simply designate its repository as the central counting point.
It maintains the table. The count could be the number of commits in its repository.² But that's more than we need: the count can simply be the number of entries in the table; there is no need to count non-built releases. In any case, the translation from "count" to "hash", or vice versa, is done by looking up the appropriate entry in the table, or adding it.
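A minimal sketch of such a table, assuming it lives in one process on the release-build system. The class name RevisionTable, its methods, and the sample hashes are all hypothetical; a real setup would also need to persist the table somewhere durable.

```python
import threading


class RevisionTable:
    """Hypothetical central table mapping build numbers to commit hashes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 1            # never reset, so numbers only go up
        self._by_number = {}
        self._by_hash = {}

    def number_for(self, commit_hash):
        """Return the number already assigned to a hash, or assign the next one."""
        with self._lock:          # makes fetch-and-increment atomic
            if commit_hash in self._by_hash:
                return self._by_hash[commit_hash]
            number = self._next
            self._next += 1
            self._by_number[number] = commit_hash
            self._by_hash[commit_hash] = number
            return number

    def hash_for(self, number):
        """Translate a build number back to its commit hash."""
        return self._by_number[number]


# Made-up 40-character hashes, for illustration only.
h1 = "a1" * 20
h2 = "b2" * 20

table = RevisionTable()
print(table.number_for(h1))      # 1
print(table.number_for(h2))      # 2
print(table.number_for(h1))      # 1 again: an existing entry is reused
print(table.hash_for(2) == h2)   # True
```

Because the counter is separate from the commit graph, dropping commits from the repository never makes the numbers go backwards.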
The value of this simplified count is dubious at best. Look at real software releases, which are usually tagged with a "dotted version": Git version 2.8.4, Git version 2.9.0, Git version 2.10.1; Python 2.7.12, Python 3.4.5, and so on. How does 7.3.12 compare to 7.4.0? Is it strictly "less than", or not? With Git, when you build releases, you can tag them with dotted versions like this. The tag can be distributed using Git's built-in mechanisms, and everyone can look up v7.3.12 locally and find the commit. If you do not have the tag, you probably do not have the version: you must git fetch, perhaps with --tags, from someone who does.
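If you do want to order dotted versions, one common convention is to compare them component by component as integers. The sketch below does that; the helper name parse_version is my own, and note that this ordering says nothing about which commit was made first.

```python
def parse_version(tag):
    """Turn a tag such as 'v7.3.12' into a tuple of ints: (7, 3, 12)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


# Tuples compare element by element, left to right.
print(parse_version("v7.3.12") < parse_version("v7.4.0"))   # True

# Plain string comparison goes wrong once a component has two digits.
print("7.10.0" < "7.9.0")                                   # True (misleading)
print(parse_version("7.10.0") < parse_version("7.9.0"))     # False
```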
The tags are, in effect, a distributed version of this central mapping table. Instead of counting the tags, though, we simply use their names, which have the form vX or vX.Y or whatever.
These tags can be extended with git describe, which lets you say "this many commits distant from this fixed tag, plus a unique verifier/locator in case distributed builds make the relative count break." See Sébastien Dawans' answer.
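By default, git describe prints either a bare tag name (when the commit is exactly the tagged one) or a string of the form tag-count-gABBREV, where ABBREV is an abbreviated commit hash. Here is a rough sketch that splits such a string into its parts. The function name and the sample string are made up, and it ignores variations such as a -dirty suffix.

```python
import re

# <tag>-<commits since tag>-g<abbreviated hash>
_DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<abbrev>[0-9a-f]+)$")


def parse_describe(text):
    """Split describe-style output into (tag, distance, abbreviated hash)."""
    text = text.strip()
    match = _DESCRIBE.match(text)
    if match is None:
        # No suffix: the commit is exactly the tagged one.
        return text, 0, None
    return match.group("tag"), int(match.group("count")), match.group("abbrev")


print(parse_describe("v7.3.12-5-g1a2b3c4"))  # ('v7.3.12', 5, '1a2b3c4')
print(parse_describe("v7.3.12"))             # ('v7.3.12', 0, None)
```

The distance gives you the "simple count" relative to a known tag, and the abbreviated hash pins down the exact commit when two repositories disagree about that count.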
¹This "guarantee" is kept via a simple mechanism: if two commits do have the same hash, Git simply refuses to believe that the second one exists. It won't accept it, it won't store it into the repository, and the existing hash "wins". The chance of this happening for any given pair of objects is vanishingly small: one out of 2^N, where N is the number of bits in the hash. Since Git uses SHA-1, which is 160 bits, that's 2^-160.
Due to the so-called birthday paradox or birthday problem, the probability rises rapidly with the number of objects. However, we start from such a small base that we can have trillions of objects, perhaps as many as 1.7 quadrillion or so, before the chance even rises to the same level as the chance of undetected storage-media corruption. (The names here use the "short scale"; see https://en.wikipedia.org/wiki/Quadrillion.)
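For the curious, the usual birthday-problem approximation can be checked numerically. This sketch assumes hashes behave like uniformly random 160-bit values, which is an idealization; the function name is my own.

```python
import math


def collision_probability(num_objects, hash_bits=160):
    """Approximate chance that any two of num_objects random hashes collide.

    Uses p = 1 - exp(-pairs / 2**hash_bits), with pairs = n * (n - 1) / 2.
    """
    pairs = num_objects * (num_objects - 1) // 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2 ** hash_bits)


# A single pair of objects: exactly the 2^-160 base rate.
print(collision_probability(2))

# 1.7 quadrillion objects (short scale): about 1e-18.
print(collision_probability(1_700_000_000_000_000))
```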
²If you do use this approach (counting the number of commits in its repository), you must make sure you never drop any commits, or the count would go down and hence not act like an ascending function. This is one reason a count of table entries might be better; or you could use a separate counter that you never reset, with an atomic fetch-and-increment when choosing the next number.