The git version control system is a kind of distributed log (with some conceptual similarities to the Raft consensus protocol).

Raft and some other systems have a concept of log compaction, so new clients don't need to traverse the whole change set to apply changes.

My question is: Does git have a concept of log compaction?

hawkeye
  • Git doesn't use a change log, so it doesn't need log compaction. But you can make _shallow clones_ that don't have the commits beyond the last few. If you need a complete clone, though, it is better to make a fresh one than to deepen a shallow clone. – Dan D. May 07 '16 at 02:32
  • Can you "compress out" deleted files? – hawkeye May 07 '16 at 07:21

1 Answer

new clients don't need to traverse the whole change set to apply changes.

No, git is a collection of snapshots (each a full copy of a working tree).
When you access a commit in git, you don't have to traverse the whole log or history to build its content.
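
To make the contrast concrete, here is a toy Python sketch. It is not git's real data model: the names `log`, `replay`, `snapshots`, and `checkout` are made up for illustration. A log-based system has to replay entries to rebuild state, while a snapshot-based one can look up the full state of any commit directly.

```python
# Toy model only: neither structure mirrors git's on-disk format.

# Log-style storage: state must be rebuilt by replaying every entry.
log = [
    ("put", "a.txt", "v1"),
    ("put", "b.txt", "v1"),
    ("put", "a.txt", "v2"),
    ("delete", "b.txt", None),
]

def replay(entries):
    """Rebuild the file state by applying each log entry in order."""
    state = {}
    for op, path, content in entries:
        if op == "put":
            state[path] = content
        else:  # "delete"
            state.pop(path, None)
    return state

# Snapshot-style storage: each commit maps directly to a full tree.
snapshots = {
    "c1": {"a.txt": "v1"},
    "c2": {"a.txt": "v1", "b.txt": "v1"},
    "c3": {"a.txt": "v2", "b.txt": "v1"},
    "c4": {"a.txt": "v2"},
}

def checkout(commit_id):
    """Return the full tree for one commit without reading any other commit."""
    return snapshots[commit_id]

# Both routes reach the same final state; only the log needs a full replay.
print(replay(log) == checkout("c4"))
```

In this sketch, "log compaction" would mean shortening `log` so that `replay` has less to do; `checkout` never had that cost in the first place.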

See "How does git store files?": the internal storage does use deltas in pack files (that is a form of "compaction", not just "log compaction"), but each commit still represents the full working tree.

https://i.stack.imgur.com/AQ5TG.png

Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot.
To be efficient, if files have not changed, Git doesn’t store the file again—just a link to the previous identical file it has already stored.
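
The "link to the previous identical file" works because git names each stored file content (a blob) by a hash of that content. A minimal Python sketch follows, assuming a repository that uses the default SHA-1 object format. `git_blob_id` follows git's blob hashing scheme, but `object_store` and `snapshot` are hypothetical stand-ins: real git trees are objects themselves, not plain dicts.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object ID git assigns to a blob with this content (SHA-1 object format)."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Hypothetical stand-in for the object database under .git/objects.
object_store = {}

def snapshot(files):
    """Store each file's content by ID and return a simplified path -> ID tree."""
    tree = {}
    for path, content in files.items():
        oid = git_blob_id(content)
        object_store.setdefault(oid, content)  # identical content is stored once
        tree[path] = oid
    return tree

first = snapshot({"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"})
second = snapshot({"README": b"hello\n", "main.c": b"int main(void) { return 1; }\n"})

print(first["README"] == second["README"])  # unchanged file: same blob ID
print(len(object_store))                    # 3 blobs stored for 4 file entries
```

The ID computed by `git_blob_id` should match what `git hash-object` reports for a file with the same bytes, which is one way to check the sketch against a real installation.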

VonC
  • So do deleted files remain in the change set or can you "log compact" them out? – hawkeye May 07 '16 at 07:20
  • @hawkeye No, the deleted files are no longer part of the current commit, but they are still part of past commits in the history of the git repo. – VonC May 07 '16 at 07:39
  • @hawkeye Remember, git stores content, not files. If you have two files with identical content, that content is only stored once. And if you delete one file, the tree that included it is modified to no longer list it. But the *content* remains (referenced by the other files). – VonC May 07 '16 at 07:41
  • Ok - is there a way to compress the log to remove that content? – hawkeye May 07 '16 at 09:19
  • @hawkeye There is no "log" to compress. Packfiles are already compressed (http://stackoverflow.com/a/9478566/6309). All you can do is get rid of an element you no longer want in the history of a repo with `git filter-branch` (https://git-scm.com/docs/git-filter-branch) – VonC May 07 '16 at 09:25
  • "does use delta in pack files" - might want to say "might use delta in pack files", since many git repositories have no pack files. – matt May 07 '16 at 16:06
  • @matt "since many git repositories have no pack files"... unless `git gc` is run for any reason, or a git push is done (in order to transfer compressed compact files). – VonC May 07 '16 at 16:27