Consider this simple python code, which demonstrates a very simple version control design for a dictonary:
def build_current(history):
current = {}
for action, key, value in history:
assert action in ('set', 'del')
if action == 'set':
current[key] = value
elif action == 'del':
del current[key]
return current
history = []
history.append(('set', '1', 'one'))
history.append(('set', '2', 'two'))
history.append(('set', '3', 'three'))
print build_current(history)
history.append(('del', '2', None))
history.append(('set', '1', 'uno'))
history.append(('set', '4', 'four'))
print build_current(history)
for action, key, value in history:
if key == '2':
print '(%s, %s, %s)' % (action, key, value)
Notice that by using the history list you can reconstruct the current dictionary in any state it once existed. I consider this a "forward build" (for lack of a better term) because to build the current dictionary one must start at the beginning and process the entire history list. I consider this the most obvious and straight forward approach.
As I've heard, early version control systems used this "forward build" process, but they were not optimal because most users care more about recent versions of a build. Also, users don't want to download the entire history when they only care about seeing the latest build.
My question then is, what other approaches exist for storing histories in a version control system? Perhaps a "backwards build" could be used? That might allow users to only download recent revisions without needing the entire history. I've also seen a few different formats for storing the history, namely: changesets, snapshots, and patches. What are the differences between changesets, snapshots and patches?
Of the modern popular version controls available, how do they store their histories and what are the advantages of their various designs?