41

Git mysteriously runs garbage collection "from time to time" and deletes any orphaned commits you have.

https://www.kernel.org/pub/software/scm/git/docs/git-gc.html

Scientifically, this will occur approximately 6-8 hours before you realize you really needed that commit which was just deleted.

I'd rather not have my files deleted by Git. How can I disable automatic garbage collection altogether?

Code Whisperer
  • 6
    In standard configuration, it will only delete commits that have been orphaned (i.e. not been accessible from the history of any branch) for more than 90 days. It doesn't "delete your files". It stops preserving things that *you* have deleted months ago. – Sven Marnach Jan 22 '15 at 15:51
  • 2
    A lot of the time I might unwittingly destroy the path to a commit and orphan it by doing a rebase. My intent isn't really to delete this info. – Code Whisperer Jan 22 '15 at 15:58
  • 5
    Fair enough -- that's a reasonable preference. I personally prefer a workflow that simply keeps things I might still need in the history. I would suggest increasing `gc.reflogexpire` and friends as well for your use case, since this will make finding abandoned commits easier. It is worth noting that never running garbage collection might decrease git's performance. – Sven Marnach Jan 22 '15 at 16:10
  • 2
    Even after a rebase, your old commits are preserved thanks to the [reflog](http://git-scm.com/docs/git-reflog). To give yourself an easier out, before your rebase run `git checkout -b mulligan`. – Greg Bacon Jan 22 '15 at 16:12
  • 1
    @GregBacon Doing this will successfully prevent commits from being marked as garbage – Code Whisperer Jan 22 '15 at 16:14
  • 1
    Plus one for "Scientifically, this will occur approximately 6-8 hours before..." – LarsH Feb 27 '18 at 14:09
  • My Waferfish here tells me you misspelled "Scientologically" there, as "Scientifically". – Armen Michaeli Aug 25 '21 at 09:43
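
As the comments above point out, a commit orphaned by a rebase or reset stays reachable through the reflog for a while. A minimal sketch of getting one back, assuming a throwaway repository; the branch name `rescued`, the identity values, and the commit messages are placeholders chosen for illustration:

```shell
# Throwaway repository so nothing real is touched.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config user.name "Example"                    # placeholder identity
git config user.email "example@example.invalid"   # placeholder identity

git commit --quiet --allow-empty -m "base"
git commit --quiet --allow-empty -m "work I still need"
lost="$(git rev-parse HEAD)"

# Orphan the second commit: no branch points at it any more.
git reset --quiet --hard HEAD~1

# The reflog still records where HEAD was before the reset.
git --no-pager reflog

# Give the orphan a name again so it is reachable from a branch.
git branch rescued "HEAD@{1}"
rescued="$(git rev-parse rescued)"
echo "rescued branch points at $rescued"
```

Creating the safety branch before the risky operation, as suggested in the comments, avoids the reflog lookup entirely.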

3 Answers

53

From the very same page you just linked to:

Some git commands may automatically run git gc; see the --auto flag below for details. If you know what you’re doing and all you want is to disable this behavior permanently without further considerations, just do:

$ git config --global gc.auto 0
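
A small sketch of applying, checking, and later undoing the setting. It assumes a scratch repository and per-repository configuration so that no real settings are changed; add `--global` to the `git config` calls to affect every repository, as in the command above:

```shell
# Work in a throwaway repository so no real configuration is touched.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Disable automatic gc for this repository only.
git config gc.auto 0
auto_value="$(git config --get gc.auto)"
echo "gc.auto = $auto_value"

# Remove the override again; Git falls back to its built-in default.
git config --unset gc.auto
if git config --local --get gc.auto >/dev/null; then
    restored=no
else
    restored=yes
fi
echo "override removed: $restored"
```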
SLaks
  • Well ain't that the ticket – Code Whisperer Jan 22 '15 at 15:30
  • All it tells me is "bash: git: command not found" Using git extensions – Hatchling Aug 24 '16 at 19:05
  • 13
    @Hatchling: Sounds like you need to install git. – SLaks Aug 25 '16 at 14:54
  • 1
    @SLaks I had to use the default bash console provided by git itself. The GitExtensions console seems to not recognize some commands. – Hatchling Aug 25 '16 at 21:31
  • 2
    What the quoted paragraph does *not* tell us is whether this is the *only* mechanism by which gc happens. The OP says "Git mysteriously runs Garbage collection 'from time to time'", giving the impression that gc may also happen *without* being triggered by the user issuing any git commands. Assurance that this doesn't actually happen would fill in the gap in this answer. – LarsH Feb 27 '18 at 14:58
11

2023: to protect specific objects, git config gc.recentObjectsHook ./precious-objects can help (Git 2.42+)
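
A rough sketch of what such a hook could look like, assuming the hook only has to print one full hex object ID per line for the objects that should be treated as recent. The file names `precious-ids` and `precious-objects` and the scratch repository are illustrative assumptions:

```shell
# Throwaway repository for illustration.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Store an object nothing refers to, and record its ID in a list file.
# "precious-ids" is a made-up file name.
precious_id="$(echo "keep me" | git hash-object -w --stdin)"
echo "$precious_id" > "$repo/.git/precious-ids"

# Hypothetical hook script: prints one full hex object ID per line.
# The heredoc is unquoted so that $repo is expanded while writing the file.
cat > "$repo/.git/precious-objects" <<EOF
#!/bin/sh
cat "$repo/.git/precious-ids"
EOF
chmod +x "$repo/.git/precious-objects"

# Point gc at the hook (Git 2.42+; older versions ignore the key).
git config gc.recentObjectsHook "$repo/.git/precious-objects"

# Run the hook by hand to see which IDs gc would be told about.
hook_output="$(sh "$repo/.git/precious-objects")"
echo "$hook_output"
```

The sketch only checks what the hook prints; it does not run a gc to confirm the object is kept.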


2015: Another approach, recently documented with git config gc.xxx: "now" and "never" for 'expire' settings

In addition to approxidate-style values ("2.months.ago", "yesterday"), consumers of 'gc.*expire*' configuration variables also accept and respect 'now' ("do it immediately") and 'never' ("suppress entirely").

See commit 8cc8816 (28 Jul 2015) by Eric Sunshine (sunshineco).
Suggested-by: Michael Haggerty (mhagger).
(Merged by Junio C Hamano -- gitster -- in commit 8cc8816, 28 Jul 2015)

That means this would also prevent gc from expiring (and therefore deleting) anything:

git config --global gc.pruneExpire never
git config --global gc.reflogExpire never
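
A sketch of the mechanics in a scratch repository, using per-repository settings instead of `--global`. `gc.reflogExpireUnreachable` is not part of the two commands above and is added here as an assumption about what a "never expire" setup would want; the identity values and commit messages are placeholders. A freshly orphaned commit would also survive under the default grace periods, so this shows how the pieces fit together rather than proving the settings are required:

```shell
# Throwaway repository so no real configuration is touched.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config user.name "Example"                    # placeholder identity
git config user.email "example@example.invalid"   # placeholder identity

# Never expire unreachable objects or reflog entries in this repository.
git config gc.pruneExpire never
git config gc.reflogExpire never
git config gc.reflogExpireUnreachable never

# Create a commit, then orphan it by moving the branch back.
git commit --quiet --allow-empty -m "base"
git commit --quiet --allow-empty -m "about to be orphaned"
orphan="$(git rev-parse HEAD)"
git reset --quiet --hard HEAD~1

# A manual gc still repacks, but has nothing it is allowed to expire.
git gc --quiet

if git cat-file -e "$orphan"; then survived=yes; else survived=no; fi
echo "orphan $orphan still present: $survived"
```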

However, you may encounter (if you use configuration value never):

warning: There are too many unreachable loose objects; run 'git prune' to remove them. 

In that case, you probably want to set gc.auto to some high value (e.g. 100000) if you really do not want to expire anything. That will silence the warning, but it may make garbage collection less effective overall, so treat it as a workaround rather than a real fix. See "Is it possible to get `git gc` to pack reflog objects?" for additional details.
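
For reference, gc.auto is the approximate number of loose objects above which `git gc --auto` starts doing work (6700 by default). A sketch, assuming a scratch repository and a per-repository setting, of raising it and comparing it with the current loose-object count:

```shell
# Throwaway repository so no real configuration is touched.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Raise the loose-object threshold for this repository only.
git config gc.auto 100000
threshold="$(git config --get gc.auto)"

# "count" in this output is the number of loose objects in the repository.
loose="$(git count-objects -v | sed -n 's/^count: //p')"
echo "loose objects: $loose (auto gc threshold: $threshold)"
```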


To prevent only the background (detached) run of git gc, set, as in nornagon's answer:

git config --global gc.autodetach false

That comes from Git v2.14.0-rc1 commit c45af94 (11 Jul 2017) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 764046f, 18 Jul 2017)

We run an early part of "git gc" that deals with refs before daemonising (and not under lock) even when running a background auto-gc, which caused multiple gc processes attempting to run the early part at the same time.
This is now prevented by running the early part also under the GC lock.

gc: run pre-detach operations under lock

We normally try to avoid having two auto-gc operations run at the same time, because it wastes resources.
This was done long ago in 64a99eb (gc: reject if another gc is running, unless --force is given, 2013-08-08, v1.8.5-rc0).

When we do a detached auto-gc, we run the ref-related commands before detaching, to avoid confusing lock contention.
This was done by 62aad18 (gc --auto: do not lock refs in the background, 2014-05-25, Git v2.0.1).

These two features do not interact well.
The pre-detach operations are run before we check the gc.pid lock, meaning that on a busy repository we may run many of them concurrently.
Ideally we'd take the lock before spawning any operations, and hold it for the duration of the program.

This is tricky, though, with the way the pid-file interacts with the daemonize() process.
Other processes will check that the pid recorded in the pid-file still exists. But detaching causes us to fork and continue running under a new pid.
So if we take the lock before detaching, the pid-file will have a bogus pid in it. We'd have to go back and update it with the new pid after detaching.
We'd also have to play some tricks with the tempfile subsystem to tweak the "owner" field, so that the parent process does not clean it up on exit, but the child process does.

Instead, we can do something a bit simpler: take the lock only for the duration of the pre-detach work, then detach, then take it again for the post-detach work.

Technically, this means that the post-detach lock could lose to another process doing pre-detach work.
But in the long run this works out.

That second process would then follow-up by doing post-detach work. Unless it was in turn blocked by a third process doing pre-detach work, and so on.

This could in theory go on indefinitely, as the pre-detach work does not repack, and so need_to_gc() will continue to trigger.
But in each round we are racing between the pre- and post-detach locks.
Eventually, one of the post-detach locks will win the race and complete the full gc.

So in the worst case, we may racily repeat the pre-detach work, but we would never do so simultaneously (it would happen via a sequence of serialized race-wins).

VonC
  • 2
    This is much better than setting `gc.auto` to zero because `gc` will e.g. automatically repack the objects to improve performance. You really want gc but you do not want to expire stuff, which is exactly what these configuration parameters do. – Mikko Rantalainen Oct 25 '18 at 07:43
  • However, you may encounter `warning: There are too many unreachable loose objects; run 'git prune' to remove them.` if you use configuration value `never`. In that case you probably want to set `gc.auto` to some high value (e.g. 100000) if you really do not want to expire anything. – Mikko Rantalainen Mar 07 '19 at 11:58
  • @MikkoRantalainen Thank you. I have included your comment in the answer for more visibility. – VonC Mar 07 '19 at 12:30
  • It might be nice to mention that increasing `gc.auto` is just a workaround. If I understand the code in `gc.c` of the git source correctly, a high `gc.auto` value may cause automatic gc (including compressing normal stuff) to be skipped, too. See also: https://stackoverflow.com/q/55043693/334451 – Mikko Rantalainen Mar 07 '19 at 12:57
  • @MikkoRantalainen by all means, do edit this answer to include what you deem relevant. – VonC Mar 07 '19 at 14:01
  • I feel like people would want `gc.autodetach false` to be a global config setting - just like the pruneExpire etc... Going to edit answer, but please change back if you disagree – Devin Rhode Aug 05 '21 at 17:36
  • @DevinRhode Good edit, thank you. I just reformatted certain elements in my old answer. – VonC Aug 05 '21 at 17:55
  • I have `auto = 1000000000`, `reflogExpire = never`, `reflogExpireUnreachable = never`, `autodetach = false` and yet git still tried to do its packing. What's going on? – user541686 Apr 30 '23 at 17:29
  • @user541686 What version of Git are you using? On which OS? – VonC Apr 30 '23 at 21:30
  • @VonC: I'm running git-for-windows 2.37.3 on WSL (which is basically Linux)... but surely this isn't version- or OS-dependent? – user541686 May 01 '23 at 05:52
  • @user541686 Would the issue persist with Git for Windows 2.40.1? – VonC May 01 '23 at 09:11
  • VonC: Uhh I'm not sure, I'd have to check. Not entirely sure how to trigger it, it's rare so it might be months before I know. Thanks for the tip though, I'll try it out. – user541686 May 01 '23 at 21:44
  • @VonC: Wait, I just checked the website and it seems 2.40.1 is just... the latest version? Do you actually have a reason to believe that version is any different? or did you just ask me to install the latest version out of habit? – user541686 May 01 '23 at 23:25
  • @user541686 Habit, but also because [I monitor Git commits for the past decade](https://stackoverflow.com/search?tab=newest&q=user%3a6309%20%22with%20Git%22&searchOn=3), and I know gc (garbage collection) has been impacted with [Git 2.38](https://stackoverflow.com/a/73271114/6309). – VonC May 02 '23 at 06:22
  • how to re-enable gc auto above ? – Dimas Lanjaka Aug 24 '23 at 22:44
  • @DimasLanjaka As I [mentioned here](https://stackoverflow.com/a/75863164/6309), you could simply unset `gc.auto`, restoring its default value. – VonC Aug 25 '23 at 09:38
  • looks like `git config --global gc.auto 1` ? – Dimas Lanjaka Aug 25 '23 at 12:18
  • @DimasLanjaka I was more thinking about `git config --global --unset gc.auto` – VonC Aug 25 '23 at 12:29
2

I found this answer because I was trying to prevent git from running git gc in the background, because it was messing with other operations I was trying to perform on the repo. It turns out you can specifically disable the backgrounding behaviour with

$ git config gc.autodetach false

If you want this behavior on all repos:

$ git config --global gc.autodetach false
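
A short sketch of checking the setting and running an automatic gc in the foreground, assuming a scratch repository and a per-repository setting rather than `--global`:

```shell
# Throwaway repository so no real configuration is touched.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Keep automatic gc in the foreground for this repository.
git config gc.autodetach false
detach="$(git config --bool --get gc.autodetach)"
echo "gc.autodetach = $detach"

# With the setting in place, an auto gc (if one is needed at all)
# finishes before the command returns instead of detaching.
git gc --auto
gc_status=$?
echo "git gc --auto exited with $gc_status"
```

In a fresh repository there is nothing to collect, so `git gc --auto` returns immediately.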
Devin Rhode
nornagon