Git is very much oriented to the idea of adding new things (commits and their underlying objects) to the database, without ever removing any old things.
When you do manage to remove some old thing(s), if Git ever encounters them again, it sees them as new things and adds them back in. You can, if you like, think of this as getting "re-infected". Every copy of the repository that has the "infection" is "contagious", and touching any of them (via git fetch or git push) can bring back the objects you thought you had gotten rid of.
Now how do I find the commit/push that caused this?
Finding a particular fetch or push that caused it is difficult-to-impossible. Finding the commit(s) that contain the large objects is possible; see the answer you linked, and other links within it.
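As a rough illustration of the "find the large objects" step, here is a minimal Python sketch. It assumes `git` is on the PATH; the function names and the 10 MiB default threshold are made up for this example. It pipes the output of `git rev-list --objects --all` into `git cat-file --batch-check` and keeps the blobs at or above the threshold.

```python
import subprocess

def parse_batch_check(output, min_size):
    """Return (size, hash, path) for blobs of at least min_size bytes, largest first."""
    found = []
    for line in output.splitlines():
        # Expected line shape: "<type> <hash> <size> <path>"; the path may be
        # empty (commits) or contain spaces, hence the bounded split.
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue
        size = int(parts[2])
        if size >= min_size:
            path = parts[3] if len(parts) > 3 else ""
            found.append((size, parts[1], path))
    return sorted(found, reverse=True)

def large_blobs(repo=".", min_size=10 * 1024 * 1024):
    """List large blobs reachable from any ref in repo (hypothetical helper)."""
    # Every reachable object, with the path it was found under where known.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True).stdout
    # cat-file reports type and size; %(rest) echoes the path back.
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True).stdout
    return parse_batch_check(sizes, min_size)
```

Given one of the reported hashes, `git log --all --find-object=<hash>` (available in reasonably recent Git versions) should list the commits that add or remove that blob.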
Also is there an easier way to revert the push to get the repo size back to normal?
You must ditch the commit(s) that contain the large objects, and if there are later commits that you wish to retain that depend on those earlier commits, copy the later commits to new, different commits that no longer depend on the earlier commits. This is what git filter-branch does. Once no branch tip either points to, or has in its commit ancestry chain, any commit that contains the large objects, you can re-pack and shrink the repository.
The BFG Repo-Cleaner is said to be much easier to use (it does all this for you), but I have never used it.
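For the filter-branch route, the sequence of "copy the commits, drop the old ones, re-pack" might look roughly like the sketch below. It is not a tested recipe for any particular repository: `purge_path` and `run_git` are hypothetical names, the path is assumed to be a single file, and the `git` parameter exists only so the command sequence can be inspected without touching a real repository. Try it on a scratch clone first.

```python
import shlex
import subprocess

def run_git(*args, repo="."):
    """Run one git command in repo and return its standard output."""
    return subprocess.run(["git", "-C", repo, *args], check=True,
                          capture_output=True, text=True).stdout

def purge_path(path, git=run_git):
    """Rewrite all refs without `path`, then discard the old objects."""
    # Shell command that filter-branch evaluates against each commit's index.
    index_filter = "git rm --cached --ignore-unmatch -- " + shlex.quote(path)
    # Copy every commit on every ref to a new commit that lacks the file;
    # "--tag-name-filter cat" makes tags follow the rewritten commits.
    git("filter-branch", "--index-filter", index_filter,
        "--tag-name-filter", "cat", "--", "--all")
    # filter-branch keeps the pre-rewrite refs under refs/original/,
    # which would keep the large objects alive; remove them.
    for ref in git("for-each-ref", "--format=%(refname)",
                   "refs/original/").splitlines():
        git("update-ref", "-d", ref)
    # Reflog entries also hold on to the old commits.
    git("reflog", "expire", "--expire=now", "--all")
    # Only now can the unreferenced objects be pruned and the rest re-packed.
    git("gc", "--prune=now", "--aggressive")
```

This shrinks only the repository it runs in. Every other copy still has the old commits and can re-introduce them on the next fetch or push.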
... how can I prevent these objects being pushed back again by accident?
This is trickier. There are a number of approaches that work to varying degrees:
- Self-discipline by every person doing a push. Before pushing, everyone who pushes must make sure they will not be re-introducing the unwanted large objects. Obviously, this works only to the extent that people exercise it.
- Limit the set of people who are allowed to push. This reduces the above problem to a small number of people.
- Use Git hooks to verify that a requested update will not introduce either any large object, or any specific (known by hash ID) previous large object. This requires that you be able to install and maintain hooks on your Git service provider. If that provider is GitHub, you cannot do this, but they already include a "reject large objects" hook so you don't have to anyway.
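The hook approach could be sketched as a server-side pre-receive hook written in Python. This is only an outline under several assumptions: the 50 MiB limit and the (empty) `BLOCKED` set are placeholders, the function names are invented, and the server must let you install hooks and have `python3` and `git` available. The hook reads the proposed ref updates from standard input, lists the objects each one would add, and rejects the push if any blob is too large or has a known unwanted hash ID.

```python
import subprocess
import sys

MAX_BYTES = 50 * 1024 * 1024  # assumed size limit; adjust to taste
BLOCKED = set()               # assumed: full hash IDs of known unwanted blobs

def violations(batch_check_output, max_bytes, blocked):
    """Return (hash, size, path) for blobs that are too large or blocked."""
    bad = []
    for line in batch_check_output.splitlines():
        # Expected line shape: "<type> <hash> <size> <path>".
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue
        sha, size = parts[1], int(parts[2])
        if sha in blocked or size > max_bytes:
            bad.append((sha, size, parts[3] if len(parts) > 3 else ""))
    return bad

def describe_new_objects(new_sha):
    """Type, hash, size and path of objects the push would add."""
    # Objects reachable from the pushed commit but from no existing ref.
    listed = subprocess.run(
        ["git", "rev-list", "--objects", new_sha, "--not", "--all"],
        check=True, capture_output=True, text=True).stdout
    return subprocess.run(
        ["git", "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=listed, check=True, capture_output=True, text=True).stdout

def main(stdin=sys.stdin):
    failed = False
    for line in stdin:
        # Each input line is "<old-hash> <new-hash> <ref-name>".
        old, new, ref = line.split()
        if set(new) == {"0"}:  # all-zero hash: the ref is being deleted
            continue
        report = describe_new_objects(new)
        for sha, size, path in violations(report, MAX_BYTES, BLOCKED):
            print(f"rejected {ref}: {path or sha} ({size} bytes)",
                  file=sys.stderr)
            failed = True
    return 1 if failed else 0  # a non-zero exit status refuses the push
```

To use it as a hook, the file would be saved as `hooks/pre-receive` in the server-side repository, made executable, given a `#!/usr/bin/env python3` first line, and ended with `sys.exit(main())`.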