Let’s say I have some git repository with files and many commits. If I do:

git reset --soft $some_commit

then modify one line and do

git add file
git commit -m message
git push --force

I see that the size of the files in the .git folder is actually increasing quite a lot (compared to how it was before the reset), even though the tracked files themselves are no larger. I was wondering what’s going on there. I thought a soft reset would revert the .git internal files, but it seems the commits are not actually being deleted. Am I missing something?

anymous.asker
  • The size of *which* files in the `.git` folder? Be specific. – torek Feb 13 '19 at 19:29
  • Without any specifics, it sounds like you are making changes, and those will have to be recorded within the git files. Can you give more details about which files are growing and what kind of growth you are seeing (quite a lot may mean several gigs to me and several bytes to others). – dmoore1181 Feb 13 '19 at 19:37
  • I'll provide more info about the files that grow, but about the file sizes: let's say I revert back 1 commit that modifies two lines, then make a new commit that in addition modifies one extra line. Non-git files amounted to ~300kb, repo size increases by roughly the same amount after force-pushing, with all the increase happening inside the `.git` folder. – anymous.asker Feb 13 '19 at 19:51

1 Answer

To illustrate what's happening, let's say you have five commits.

A - B - C - D - E [master]

Then you reset back to C.

$ git reset --soft C
A - B - C [master]
         \
          D - E

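The reverted commits are still there in your local repository. Resetting does not delete them; it only moves the branch, so no branch references them any more. If they stay unreferenced they will eventually be garbage collected: with default settings, their reflog entries expire after about 30 days, and after that git gc is free to prune the objects.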
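A minimal sketch of how one might check this, assuming a throwaway repository created with mktemp (the user name, file name, and commit messages are made up for the demo):

```shell
#!/bin/sh
# Demo in a scratch repository; all names here are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# Build the history A - B - C - D - E.
for name in A B C D E; do
    echo "$name" > file
    git add file
    git commit -q -m "$name"
done

old_tip=$(git rev-parse HEAD)      # commit E
target=$(git rev-parse HEAD~2)     # commit C

git reset -q --soft "$target"

new_tip=$(git rev-parse HEAD)              # the branch now points at C
old_type=$(git cat-file -t "$old_tip")     # E is still in the object database
echo "HEAD=$new_tip old_tip=$old_tip type=$old_type"
```

git cat-file -t only succeeds if the object still exists, so it is a quick way to confirm that the reset left E in place.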

Then you make a new commit.

$ git commit
A - B - C - F [master]
         \
          D - E

Again, the old commits are still there.

Conceptually, Git stores the entire changed file, not just the diff. If you make one tiny change to a large file, .git might grow by roughly the size of the whole file, because Git stores a new (zlib-compressed) copy as a loose object. Git will eventually pack its database, storing similar objects as deltas against each other, which reduces the size. If you're impatient, you can run git gc. Generally, Git storage is extremely efficient.
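To see the effect of packing for yourself, something like the following should work. It is only a sketch in a scratch repository; the file contents and commit messages are invented for the demo.

```shell
#!/bin/sh
# Compare loose-object counts before and after git gc (scratch repo).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# A 2000-line file, then three commits that each append a single line.
awk 'BEGIN { for (i = 1; i <= 2000; i++) print "line " i }' > file
git add file
git commit -q -m "initial"
for i in 1 2 3; do
    echo "change $i" >> file
    git commit -q -am "change $i"
done

# "count" is the number of loose objects, "packs" the number of pack files.
loose_before=$(git count-objects -v | awk '/^count:/ { print $2 }')
git gc -q
loose_after=$(git count-objects -v | awk '/^count:/ { print $2 }')
packs_after=$(git count-objects -v | awk '/^packs:/ { print $2 }')
echo "loose before gc: $loose_before, after: $loose_after, packs: $packs_after"
```

Running git count-objects -vH in your real repository before and after git gc gives the same picture in human-readable sizes.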

Pushing has no effect on your local repository.


Those commits are not totally unreachable. You can still access them from the git reflog and put a new tag or branch on them. For example, if you realized you made a mistake and want to go back you can move master back to where it was.

$ git reset --hard E
A - B - C - F
         \
          D - E [master]
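If you would rather keep F and simply get a handle on the old commits, putting a branch on the reflog entry also works. A rough sketch, assuming a scratch repository and a hypothetical branch name rescue:

```shell
#!/bin/sh
# Recover the old tip through the reflog (scratch repo, illustrative names).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

for name in A B C D E; do
    echo "$name" > file
    git add file
    git commit -q -m "$name"
done
old_tip=$(git rev-parse HEAD)      # commit E

git reset -q --soft HEAD~2         # back to C
echo F > file
git add file
git commit -q -m F                 # new commit F on top of C

# HEAD@{0} is the commit of F, HEAD@{1} the reset to C, HEAD@{2} the old tip E.
git reflog
from_reflog=$(git rev-parse 'HEAD@{2}')

# Put a branch on the old tip so it is referenced again.
git branch rescue "$from_reflog"
rescued_subject=$(git log -1 --format=%s rescue)
echo "rescue points at commit: $rescued_subject"
```

In a real repository the reflog index will differ, so read the git reflog output and pick the entry by its description or hash rather than assuming HEAD@{2}.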

There's also ORIG_HEAD. This is a special label that Git sets on the commit you moved away from. After the original git reset --soft C, ORIG_HEAD points at E.

$ git reset --soft C
A - B - C [master]
         \
          D - E [ORIG_HEAD]

And you can return there.

$ git reset ORIG_HEAD
A - B - C
         \
          D - E [master]
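The ORIG_HEAD round trip can be checked the same way. Again this is a sketch in a scratch repository with made-up names:

```shell
#!/bin/sh
# Undo a reset with ORIG_HEAD (scratch repo, illustrative names).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

for name in A B C D E; do
    echo "$name" > file
    git add file
    git commit -q -m "$name"
done
old_tip=$(git rev-parse HEAD)      # commit E

git reset -q --soft HEAD~2         # moves the branch to C, records E in ORIG_HEAD
orig=$(git rev-parse ORIG_HEAD)

git reset -q ORIG_HEAD             # move the branch back to E
restored=$(git rev-parse HEAD)
echo "ORIG_HEAD=$orig restored HEAD=$restored"
```

Keep in mind that other commands such as merge and rebase also overwrite ORIG_HEAD, so it is only reliable right after the reset.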

It works this way for two reasons. It lets Git be more efficient: disk is cheap, so Git doesn't have to optimize its storage on every change. And it allows you to change your mind.

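If you want to get rid of all unreachable objects right away, you can run git gc --prune=all (or --prune=now), which prunes them regardless of age. Note that commits still listed in the reflog count as reachable, so you would have to expire those reflog entries first with git reflog expire --expire-unreachable=now --all. Don't do this unless you are really, really short on disk space, because it throws away your safety net. Usually running git gc is sufficient to get git to compress and pack .git.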
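One wrinkle worth checking before relying on pruning: commits that are still listed in the reflog count as reachable, so pruning alone may not remove them. A sketch of the full sequence in a scratch repository (all names are illustrative, and clearing ORIG_HEAD is a precaution on my part, not something the answer requires):

```shell
#!/bin/sh
# Prune abandoned commits immediately (scratch repo; destructive in a real one).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

for name in A B C D E; do
    echo "$name" > file
    git add file
    git commit -q -m "$name"
done
old_tip=$(git rev-parse HEAD)      # commit E

git reset -q --soft HEAD~2
echo F > file
git add file
git commit -q -m F

# Pruning alone: E survives because the reflog still lists it.
git gc -q --prune=now
if git cat-file -e "$old_tip" 2>/dev/null; then after_gc=present; else after_gc=gone; fi

# Drop ORIG_HEAD in case it is treated as a reference (precaution),
# expire reflog entries for unreachable commits, then prune again.
git update-ref -d ORIG_HEAD
git reflog expire --expire-unreachable=now --all
git gc -q --prune=now
if git cat-file -e "$old_tip" 2>/dev/null; then after_expire=present; else after_expire=gone; fi

echo "after gc only: $after_gc, after reflog expire + gc: $after_expire"
```

To preview what would be lost without deleting anything, git fsck --unreachable --no-reflogs lists the objects that only the reflog is keeping alive.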

Schwern
  • Thanks for the detailed answer. Had no idea git did its own garbage collection. – anymous.asker Feb 13 '19 at 20:06
  • @anymous.asker If you want to dive in further, [the Git internals are very exposed and well documented](https://github.com/pluralsight/git-internals-pdf). Understanding how Git works is critical to using it well. – Schwern Feb 13 '19 at 20:30