20

Today, git started acting funny (well, funnier than usual) by insisting on running git gc after every single merge, even when the merges are back to back.

C:\Projects\my-current-project>git pull
remote: Counting objects: 31, done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 16 (delta 11), reused 0 (delta 0)
Unpacking objects: 100% (16/16), done.
From git.company.com:git/
   e992ce8..6376211  mybranch/next -> origin/mybranch/next
Merge made by recursive.
Auto packing the repository for optimum performance. You may also run "git gc" manually. See "git help gc" for more information.
FIND: Parameter format not correct
Counting objects: 252732, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (59791/59791), done.
Writing objects: 100% (252732/252732), done.
Total 252732 (delta 190251), reused 252678 (delta 190222)
Removing duplicate objects: 100% (256/256), done.
 .../stylesheets/style.css                          |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

This is incredibly disruptive, and I fear that it means that my repository is corrupt somehow (this is the first time I've ever seen it automatically gc). Are my fears unfounded? If my repository is OK, how do I make the auto-packing stop?!

Dave Schweisguth
Mike Caron
  • Automatic gc does happen, but it generally only happens quite rarely. – asmeurer Aug 08 '12 at 05:04
  • @asmeurer The question was about it happening on every operation... – Mike Caron Aug 13 '12 at 16:19
  • I know. But you seemed to be surprised that automatic `gc` would happen at all, so I just wanted to point out that it does happen. – asmeurer Aug 13 '12 at 21:06
  • @asmeurer I was a git newbie at the time. My assumption had been that it runs GC once, and then it's done and doesn't have to do any more. Now I know better (eg, that git runs on a slightly different self-consistent model of reality ;) – Mike Caron Sep 11 '12 at 12:43
  • Ah OK. Garbage collection is, by definition, something that runs periodically (in any system). – asmeurer Sep 14 '12 at 21:06
  • Perhaps I am not conveying myself correctly. My garbage man comes to my street every week; he does not come every time I throw away an item. Similarly, GC should happen periodically, not every time I run a git command. The latter was happening and I was confused. If it happened now, I would just sigh and say "oh git, you so crazy." – Mike Caron Sep 15 '12 at 01:50

4 Answers

14

I'm adding this answer even though it doesn't address the original poster's specific problem, because every time one of my repos starts auto-packing after every merge, I've forgotten the fix, search for it again, and find this question first.

When one of my repos starts "Auto packing the repository for optimum performance" after every merge,

git gc --prune=now

fixes it. (Being on a Mac, I don't have the FIND: Parameter format not correct problem.) Right now I'm using git 2.4.1, but this has worked for me for several 2.* versions.

This answer to "How to remove unreferenced blobs from my git repo" suggests that one might need to clear one's reflog with

git reflog expire --expire-unreachable=now --all

for the above command to be maximally effective, but I've never needed to do that to fix auto-packing after every merge.
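
To see whether the fix actually shrank the pile of loose objects, one option is to compare the loose-object count before and after. This is a minimal sketch for a POSIX-style shell, run from inside the repository; the helper names `loose_count` and `prune_and_report` are made up for illustration, while `git count-objects -v` and `git gc --prune=now` are the stock commands.

```shell
# Print the number of loose objects in the current repository.
# "git count-objects -v" reports it on the line starting with "count:".
loose_count() {
    git count-objects -v | sed -n 's/^count: //p'
}

# Run the fix from this answer and report the loose-object count around it.
# (Hypothetical helper, not part of git.)
prune_and_report() {
    echo "loose objects before: $(loose_count)"
    git gc --quiet --prune=now
    echo "loose objects after:  $(loose_count)"
}
```

If the "after" number is still in the thousands, something is probably still generating loose objects, and the reflog step above may be worth trying.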

Dave Schweisguth
14

EDIT

I think I spotted the problem.

You are probably running Cygwin/git or MsysGit on Windows. I noticed that because of the

FIND: Parameter format not correct

error message. The trouble is that somewhere your hook scripts (or git internally?!) is calling find, which does not find the UNIX (GNU) find utility but rather finds the Windows (MSDOS... sic) FIND.EXE.

You should be able to fix your system-wide PATH. If that is not an option, explicitly set the PATH environment variable inside your scripts (or before invoking them).
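
As a rough illustration of the PATH approach, the sketch below checks which executable a bare `find` resolves to and puts a chosen directory first for the current shell session only. It assumes a POSIX-style shell (such as the one bundled with msysGit or Cygwin); the helper names are invented here, and the directory you pass in is whichever one holds the GNU tools on your machine.

```shell
# Report which executable a bare command name resolves to.
resolve_cmd() {
    command -v "$1"
}

# Put a directory at the front of PATH for the current shell only.
# $1 is the directory holding the GNU tools (an assumption about your setup).
prepend_path() {
    PATH="$1:$PATH"
    export PATH
}

# Usage (the path is only an example):
#   resolve_cmd find        # shows what "find" currently means
#   prepend_path /usr/bin
#   resolve_cmd find        # should now point into /usr/bin
```

This changes nothing system-wide, so it is easy to try before touching the Windows environment settings.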


Old answer for information:

git gc --auto does not always result in any action taken; are you sure this is taking time every time, or did you just notice it is being called?

If it is being called every time, you might

  • check repository permissions (make sure it is fully writable to you!)
  • git fsck
  • git repack
  • git bundle create mybundle.git --all and git clone mybundle.git to see whether somehow you can 'shake' the culprit
  • see whether you can upgrade to a later version
  • if all else fails, strace or debug the git-gc binary

Optionally, once you have shaken the culprit, you may be able to analyze what differs between your 'cleaned' repo and the current one.

From the git-gc man-page:

With this option [--auto], git gc checks whether any housekeeping is required; if not, it exits without performing any work. Some git commands run git gc --auto after performing operations that could create many loose objects.

Housekeeping is required if there are too many loose objects or too many packs in the repository. If the number of loose objects exceeds the value of the gc.auto configuration variable, then all loose objects are combined into a single pack using git repack -d -l. Setting the value of gc.auto to 0 disables automatic packing of loose objects.

If the number of packs exceeds the value of gc.autopacklimit, then existing packs (except those marked with a .keep file) are consolidated into a single pack by using the -A option of git repack. Setting gc.autopacklimit to 0 disables automatic consolidation of packs.
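
To compare a repository against the two thresholds described in the man page, something like the following can help. It is a sketch, assuming a POSIX-style shell inside the repository; `gc_setting` and `show_gc_thresholds` are hypothetical helper names, and the fallback values 6700 and 50 are the documented defaults for gc.auto and gc.autoPackLimit.

```shell
# Print the effective value of a git config key, or a fallback if it is unset.
gc_setting() {
    git config --get "$1" || echo "$2"
}

# Show both auto-gc thresholds next to the repository's current counts.
show_gc_thresholds() {
    echo "gc.auto:          $(gc_setting gc.auto 6700)"
    echo "gc.autoPackLimit: $(gc_setting gc.autoPackLimit 50)"
    # In the output below, "count" is loose objects and "packs" is pack files.
    git count-objects -v
}
```

If "count" is far above gc.auto, or "packs" is above gc.autoPackLimit, then git gc --auto has a reason to fire on every operation until a full git gc brings the numbers back down.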

RJFalconer
sehe
  • I'm not running any hook scripts (that I am aware of!), so it must be git internally. The FIND error was fixed by changing my PATH, but I'm running a manual `git gc` to see if that fixes it. – Mike Caron Sep 13 '11 at 12:58
  • Seems that the full `git gc` fixed it. I guess something happened that caused a whole lot of garbage in my repository, pushing it well above the "needs gc" threshold, and even the "how much to do in one auto gc" threshold. What that event was, I can only guess (since there are 6 people regularly writing to this repo), but it's stopped now, so I can actually get work done. I'm going to accept this answer based on the `FIND` suggestion, as well as for making me think of running a full gc. – Mike Caron Sep 13 '11 at 14:02
  • I think another thing that could have caused it is some kind of bizarre operation (maybe a pre- or post-commit hook) that creates thousands of loose objects. Just for the record in case someone else has this problem. – asmeurer Aug 08 '12 at 05:05
6

What version of git are you using? Regardless, I find automatic gcing extremely disruptive.

git config --global gc.auto 0

Stefan Kendall
  • This is not a good idea. It happens automatically because no one is going to remember to run it manually, resulting in degrading speed of git over time in the repo, and increased filesystem usage. It happens quite rarely under normal circumstances, so it shouldn't be that disruptive. If it happens when you're doing something else, just open a new tab in your terminal and continue from there. – asmeurer Aug 13 '12 at 21:09
  • @asmeurer I agree with Stefan on this one... crontab -e with git gc --aggressive and you no longer have to remember to run it manually, and you can have it run during off hours so you don't have to worry about getting hit with it at random times. – Hazok Oct 18 '12 at 13:06
  • Sure it's fine as long as you do run it manually, which was never suggested in the answer. – asmeurer Oct 18 '12 at 17:33
  • The correct solution to avoiding the disruption is not disabling auto-gc (for the reasons mentioned by others above). Instead, create a pre-auto-gc hook with this content: `echo "Git thinks it's time to run git gc."; exit 1` That way you get timely reminders, and only run git gc at a time that suits you, rather than randomly. – laszlok Jul 21 '17 at 12:56
  • +1, as I care less about why it's suddenly decided to run every time than I do about it not disrupting what I am trying to get on with. For now, it can be disabled. Performance is a small price to pay for lack of flow disruption. – RJFalconer Mar 06 '20 at 21:34
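
For anyone who wants to try the pre-auto-gc hook suggested in the comments, here is one way it could be installed. This is a sketch assuming a POSIX-style shell; `install_pre_auto_gc_hook` is an invented helper name, and the hook body is the one-liner from the comment. A pre-auto-gc hook that exits non-zero makes git gc --auto abort, so automatic gc is skipped while a manual git gc still works.

```shell
# Install a pre-auto-gc hook that vetoes automatic gc and prints a reminder.
# $1 is the path to the repository's .git directory (defaults to .git).
install_pre_auto_gc_hook() {
    hook="${1:-.git}/hooks/pre-auto-gc"
    mkdir -p "$(dirname "$hook")"
    cat > "$hook" <<'EOF'
#!/bin/sh
echo "Git thinks it's time to run git gc."
exit 1
EOF
    chmod +x "$hook"
}

# Usage, from the top of a working tree:
#   install_pre_auto_gc_hook
```

Unlike gc.auto 0, this keeps the reminder, so the repository is less likely to go unpacked for months.
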
2

Your repository probably has a lot of loose objects whose hashes start with 17, which is what triggers git gc --auto. With the default settings, it takes at least 28 loose objects under the 17 prefix.
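
The figure of 28 seems to follow from the default gc.auto of 6700: as far as I understand the heuristic, git samples only the objects/17 directory and compares its file count against gc.auto divided by 256, rounded up. The sketch below reproduces that arithmetic and counts the files using shell globbing only (so no find is involved). The function names are made up, and the rounding rule is my reading of git's behaviour rather than something stated in this thread.

```shell
# Per-directory threshold: gc.auto divided by 256, rounded up.
# (Assumption: this mirrors git's sampling heuristic; check your git version.)
loose_threshold() {
    echo $(( ($1 + 255) / 256 ))
}

# Count regular files under <git-dir>/objects/17 using only shell globbing.
# $1 is the path to the .git directory (defaults to .git).
count_loose_17() {
    count=0
    for f in "${1:-.git}"/objects/17/*; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Usage:
#   loose_threshold 6700    # prints 27; auto gc needs more than that, i.e. 28
#   count_loose_17 .git
```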

linquize