Wikipedia, sadly... keeps every single revision of every page in its database as full text (in some form of XML, if I remember right).
Take a look at the Wikipedia (MediaWiki) database schema, specifically the recentchanges and text tables.
Hence they get wonderful O(1) lookups of even the very first copy of the "Biology" page. The unfortunate side effect is that Wikipedia's technology costs ballooned from $8 million USD in 2010-2011 to $12 million USD in 2011-2012, despite HDDs (and everything else) getting cheaper, not more expensive.
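To see the trade-off concretely, here is a minimal toy sketch of the "keep every revision in full" model. It is my own illustration, not MediaWiki's actual schema; FullCopyStore and its fields are made-up names. Any revision, including the very first one, comes back in a single key lookup, but every save duplicates the entire page text.

```python
# Toy "store every revision in full" model (hypothetical, not MediaWiki's schema):
# each save keeps the complete page text, so fetching any revision is one O(1)
# key lookup, but storage grows with the full size of every revision ever saved.
class FullCopyStore:
    def __init__(self):
        self.revisions = {}      # rev_id -> full page text
        self.next_rev_id = 1

    def save(self, page_text):
        rev_id = self.next_rev_id
        self.revisions[rev_id] = page_text   # a complete copy, nothing shared with older revisions
        self.next_rev_id += 1
        return rev_id

    def fetch(self, rev_id):
        return self.revisions[rev_id]        # O(1): no delta chain to replay

store = FullCopyStore()
first = store.save("Biology is the study of life.")
store.save("Biology is the scientific study of life.")
print(store.fetch(first))   # the first copy of the "Biology" page, in one lookup
```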
So much for revision control by keeping a full copy of every file. Git takes a cute approach. See "Is the git storage model wasteful?".
It starts out storing every file in full, much like the method above. Once the repository grows past a certain limit, it does a brute-force repack (there are options to set how hard it tries: --window=[N] and --depth=[N]), which may take hours. The repack uses a combination of delta and lossless compression: delta-compress recursively, then apply lossless compression to whatever bytes are left.
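To make "delta, then lossless" concrete, here is a minimal toy sketch in Python. It is not git's actual packfile format or delta algorithm; the names (make_delta, pack, MAX_DEPTH) and the chain-length rule are my own illustration, loosely in the spirit of --depth. Some revisions are stored in full, the rest as deltas against their predecessor, and the whole pack is then run through zlib as the lossless pass.

```python
import difflib
import pickle
import zlib

MAX_DEPTH = 3   # made-up cap on delta-chain length, loosely like git repack --depth=[N]

def make_delta(old, new):
    """Encode `new` as copy/insert ops against `old` (a crude stand-in for a real binary delta)."""
    ops = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=old, b=new).get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse old[i1:i2]
        elif j1 != j2:
            ops.append(("insert", new[j1:j2]))    # literal new text
    return ops

def apply_delta(old, ops):
    """Rebuild the newer text from the old text plus the stored ops."""
    pieces = []
    for op in ops:
        if op[0] == "copy":
            pieces.append(old[op[1]:op[2]])
        else:
            pieces.append(op[1])
    return "".join(pieces)

def pack(revisions):
    """Store a full copy whenever the delta chain gets too deep, else a delta; then zlib the whole pack."""
    entries, depth = [], 0
    for i, text in enumerate(revisions):
        if i == 0 or depth >= MAX_DEPTH:
            entries.append(("full", text))
            depth = 0
        else:
            entries.append(("delta", make_delta(revisions[i - 1], text)))
            depth += 1
    return zlib.compress(pickle.dumps(entries))   # the lossless pass over the delta-compressed entries

def unpack(blob):
    """Undo the lossless pass, then replay each delta against the revision it was made from."""
    out = []
    for kind, payload in pickle.loads(zlib.decompress(blob)):
        out.append(payload if kind == "full" else apply_delta(out[-1], payload))
    return out

revs = ["a b c", "a b c d", "a B c d", "a B c d e", "something else entirely"]
assert unpack(pack(revs)) == revs
```

The payoff is that a long run of similar revisions costs roughly one full copy plus a handful of small deltas, and zlib then squeezes out whatever redundancy remains.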
Others, like SVN, use plain delta compression (from memory, which you should not trust).
Footnote:
Delta compression stores only the incremental changes between versions.
Lossless compression is general-purpose compression, pretty much like zip, rar, etc.