
According to an answer to “How often should you use git-gc?”, git gc runs automatically every time you push to a remote.

I commented on the answer but never got a response, so I’m asking here.

I have unreachable commits in my tree (as a result of git commit --amend). This can be verified with git log --reflog. I pushed a branch to the remote repository and checked my tree again; the unreachable commits were still there. Apparently git gc was not run when this push happened. … ?

Example:

$ git commit -m 'commit A'
$ git commit -m 'commit B'
$ git commit -m 'commit C'
$ git commit --amend -m 'commit D'
$ git commit -m 'commit E'
$ git commit -m 'commit F'
$ git push origin master
$ git log --reflog
* commit F (HEAD -> master, origin/master)
* commit E
* commit D (an amendment of C)
|
| * commit C
|/
* commit B
* commit A

When I push master to remote and run git log --reflog, commit C is still visible. This is still the case even if commit C is over 30 days old. I thought git push automatically runs git gc, and I thought git gc deletes the unreachable commits (in this case, C). Am I missing something?

chharvey

3 Answers


After some research, I found the following in this answer on Stack Overflow:


Git decides whether to auto gc based on two criteria:

  1. Are there too many packs? (Literally, are there more than 50 files with .idx in .git/objects/pack?)
  2. Are there too many loose objects? (Literally, are there more than 27 files in .git/objects/17?)

If for some reason Git is not able to merge the pack files or remove the loose objects in that directory, it will think it needs to auto-gc again next time.
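The two checks quoted above can be approximated by hand. This is a minimal sketch, not Git's own code: the helper names count_packs and count_loose_17 are made up for illustration, and the thresholds to compare against are the ones the quoted answer gives (they correspond to the gc.autoPackLimit and gc.auto settings, assuming defaults).

```shell
#!/bin/sh
# Hypothetical helpers: each takes the path to a .git directory.

# Count pack index files; auto gc compares this against gc.autoPackLimit.
count_packs() {
    find "$1/objects/pack" -maxdepth 1 -name '*.idx' 2>/dev/null | wc -l | tr -d ' '
}

# Count loose objects in the sample directory objects/17, which auto gc
# uses as an estimate for the whole repository.
count_loose_17() {
    find "$1/objects/17" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
}

# Assumes the script is run from the top of a working tree unless GIT_DIR is set.
git_dir=${GIT_DIR:-.git}
echo "pack index files: $(count_packs "$git_dir")"
echo "loose objects in objects/17: $(count_loose_17 "$git_dir")"
```

If both numbers are below the thresholds, git gc --auto has nothing to do, which would explain why a push appears not to collect anything.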


However, this book says:

Git runs garbage collection automatically if:

• There are too many loose objects in the repository

• A push to a remote repository happens

• After some commands that might introduce many loose objects

• When some commands such as git reflog expire explicitly request it

However, I couldn't find anywhere which commands actually trigger it. My guess is that this is not defined by any specification and depends on the implementation and version you are using.

In any case, the book clearly agrees with you that a push triggers git gc, but I couldn't find confirmation of this anywhere else.

Instead, I found an example in the git push documentation here which says that, unless you run git gc after your git push, there will be unreachable commits.

rakwaht

You are assuming that commits will be garbage collected immediately, but git will actually wait 30 days by default until it removes unreachable commits. This behaviour can be altered with the gc.reflogExpireUnreachable configuration setting.
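To see which expiry periods apply in a given repository, the relevant settings can be read with git config. This is a small sketch: show_gc_setting is a hypothetical helper, and the fallback values are Git's documented defaults, shown only when the setting is unset.

```shell
#!/bin/sh
# Hypothetical helper: print a setting, or the given default if it is unset.
show_gc_setting() {
    value=$(git config --get "$1") || value="$2 (default)"
    echo "$1 = $value"
}

show_gc_setting gc.reflogExpire "90 days"
show_gc_setting gc.reflogExpireUnreachable "30 days"
show_gc_setting gc.pruneExpire "2 weeks ago"
```

Note that these periods only matter when git gc actually runs; they do not cause anything to be deleted on their own.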

Klas Mellbourn
  • But even unreachable commits over 30 days old are still not being deleted. I have commits over a year old still left in the repo! Sorry, I should have mentioned that in my question. – chharvey Oct 12 '17 at 15:41

You could do this to remove unreachable commits from the local repo:

 git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 \
    -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc "$@"

See this answer for more information. I suspect your problem stems from the reflog itself keeping a reference to the unreachable commits (thus making them "reachable" as far as git gc is concerned).
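The reflog effect can be checked in a throwaway repository. The sketch below is only an illustration under assumptions: the temporary directory, the placeholder identity, and the object_state helper are all made up for the demo. It uses an amended commit, as in the question.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"            # placeholder identity for the demo
git config user.email "demo@example.com"

# Hypothetical helper: report whether an object is still in the object database.
object_state() {
    if git cat-file -e "$1" 2>/dev/null; then echo present; else echo gone; fi
}

echo one > file
git add file
git commit -q -m 'commit C'
old=$(git rev-parse HEAD)                   # the commit that --amend replaces
git commit -q --amend -m 'commit D'

# A plain gc, even with immediate pruning, keeps C: the reflog still points at it.
git gc --quiet --prune=now
before=$(object_state "$old")
echo "after plain gc: commit C is $before"

# Expire every reflog entry first, then collect again.
git reflog expire --expire=now --expire-unreachable=now --all
git gc --quiet --prune=now
after=$(object_state "$old")
echo "after expiring the reflog and gc: commit C is $after"
```

The first message should report the commit as present and the second as gone, which matches the idea that the reflog, not the push, decides whether the old commit survives.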

Klas Mellbourn