
So this one's pretty basic. For reasons unknown, Git allows you to delete history. I've searched and searched and searched for a way to turn this off, and it seems there isn't one. Indeed, not only does Git allow you to delete history, but every tutorial I could find recommends that you regularly delete history, "to keep the graph clean". Deleting stuff isn't just possible, it's literally the recommended workflow! [insert horrified face here]

Given the above, it is 100% guaranteed that at some point somebody on my team (possibly even me) will accidentally delete something they didn't mean to, and then everything is ruined forever. Of course, if you realise you messed up, just don't push that to the central repo. You can just clone a new repo and delete the broken one. Problem solved. But what if you don't realise you did something wrong, and you end up pushing it? Now the central repo is broken, and nobody can fix it.

As far as I can tell, there are real, commercial companies doing real, mission-critical work using Git. So how do they "deal with" the abject lack of a safety net here? Surely they must have found a workaround for this. I can't imagine them going "oopsy, we just accidentally deleted 15 years' worth of dev work. Oh well, never mind, eh?"

For context: I'm used to working with Mercurial. In that system, you can uncommit something after you committed it, but once it's been pushed to the central repo it's basically impossible to ever delete it. You can create a new commit that undoes whatever it did, but you cannot remove the original commit from history. In this model, no matter how badly you screw up the repo, you can always just revert to a time before you messed everything up. Heck, even if the central repo burns down somehow, just create a new, empty repo and have everyone push to it, and you're back in business again. Because every single repo is like a backup copy of your entire project history, and that history can never be damaged. Unfortunately, Git doesn't work anything like that. Everything can be deleted, and there's no undo.

isherwood
MathematicalOrchid
  • You should probably link to some of the articles you reference (especially when you quote from them). I'm not familiar with this practice. – isherwood Feb 02 '22 at 14:15
  • You can add server side hooks to apply rules on what a client can do to the remote. See https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks#_server_side_hooks – Andreas Louv Feb 02 '22 at 14:18
  • Maybe you are referring to `git gc`, which lets you clean up unreachable commits and files in your repository, but other than that I have never seen deleting content as a recommended practice. Maybe unused branches too, branches that have already been merged, but that doesn't really delete the branch, just the reference. – Jens Feb 02 '22 at 14:22
  • Agreed that removing history does not seem like a Git best practice, also that server-side hooks would be the safest method of protecting a remote repo. Bear in mind though, that platforms like Azure DevOps don't really implement server-side hooks. There are branch policies but those are more to require things like pull requests than to inspect the intent of a commit. – WaitingForGuacamole Feb 02 '22 at 14:26
  • To turn it off, you can set `gc.pruneExpire` to `never`. This does not prevent you from rebasing and otherwise rewriting the history of a branch, but it should prevent any commits from ever being deleted. I suspect the actual deletion of commits from the history is *not* what you are complaining about and this will not help you at all, but it should calm your fears. git does *not* encourage the deletion of history. Your commits are immutable and it is actually difficult to inadvertently delete anything you care about. If you simply tag a commit, the garbage collector will never delete it. – William Pursell Feb 02 '22 at 14:43
  • Backup. Protected branches. Hooks to prevent force-pushing. – phd Feb 02 '22 at 14:51
  • As long as there is any ref to a commit, the commit will not be deleted. I think your mental model is wrong. Don't think of the branch as being the history of the project. git maintains a graph of immutable commits and your branch is one view into that graph. You can change your branch to point to a different node in the graph, and the view of history from that point is different from the view from somewhere else, but the old node still exists. – William Pursell Feb 02 '22 at 14:56
  • @WilliamPursell It seems that once the last reference to a commit is gone, technically the commit still *exists*, but now there's no way to access it. (Unless you somehow magically know the hashcode for it.) And a few days later, the GC will delete it for real. So you have an extremely small window for realising something is wrong and desperately trying to fix it. – MathematicalOrchid Feb 02 '22 at 15:05
  • "You can just clone a new repo and delete the broken one. Problem solved." [insert horrified face here] – Obsidian Feb 02 '22 at 15:06
  • @MathematicalOrchid In reality, it's not a problem. The "extremely small window" is two weeks by default for unreachable objects (`gc.pruneExpire`), reflog entries keep otherwise-unreachable commits alive for 30 days by default, and you can easily make either a year or 5 years or forever. But the only practical way to lose all refs to a commit is if you successfully rebase a branch, at which point you have a new commit that contains all of the previous content (and history), so the previous commits are indeed garbage that you really do not care about. Your content is safe. – William Pursell Feb 02 '22 at 15:32
  • Recovering the hash of the old commit is (usually) trivial with `git reflog` – William Pursell Feb 02 '22 at 15:39

3 Answers


The question seems to be based on a misunderstanding. Actually, deleting something is not possible in Git. You cannot mutate (edit) or remove (delete) a commit in Git. Git's job is to preserve history, and commits are the stuff of history in Git.

There is a rule that if a commit is not reachable (by way of some commit's name, such as a branch or tag, plus that commit's parent chain) it may be removed as part of garbage collection. But even that does not happen immediately when the commit becomes unreachable. If you make a commit unreachable by mistake, you literally have weeks to correct that mistake.
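
To make that concrete, here is a minimal sketch of "losing" a commit and getting it back. It assumes a throwaway repository created with `mktemp`, the default reflog settings of a non-bare repository, and a made-up branch name `rescued`; none of those names come from the question.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)

# "Delete" the second commit by moving the branch back one step.
git reset -q --hard HEAD~1

# The commit is no longer reachable from the branch, but it is still in the
# object store, and the reflog remembers where HEAD pointed before the reset.
recovered=$(git rev-parse 'HEAD@{1}')

# Give it a name again so it is reachable (hypothetical branch name).
git branch rescued "$recovered"
git log --oneline rescued
```

The same idea applies after a bad rebase or a deleted local branch: find the old hash with `git reflog` and point a branch or tag at it.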

That said: sure, you can commit any idiocy you want. The repository (containing the commits) is in a .git folder. Nothing stops you from deleting it, any more than anything stops you from erasing your hard disk.

matt

I've never seen anybody recommend deleting Git history. Standard best practices agree that you should "never re-write history", which would include deleting commits.

That said, this is a situation that you could theoretically create, but there are multiple safety nets that would prevent you from doing so in practice:

  1. To push a history that conflicts with the server's view of a branch, you have to add extra flags, e.g. `--force` or `--force-with-lease`. You can't do this by accident.
  2. If someone does force-push a totally broken history, everyone on the team will notice, because their next pull will report diverged histories or merge conflicts. (A conflicted merge can be abandoned safely with `git merge --abort`, so local repos will not become unintentionally broken.)
  3. Even if it somehow goes unnoticed, you still have an extremely good chance of being able to recover. Since the Git repo is distributed and not centralized, every developer has their own copy of the repo. All it takes is one person with a "good" copy, and you can recover from that state. (This includes any branch that was forked from main/master: that branch will still have the full history even if main/master is broken.)
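
If you administer the central repository yourself, you can go one step further than item 1 and have the server refuse history rewrites outright. The sketch below assumes a plain bare repository (hosted platforms usually expose this as "protected branches" instead); `central.git` and `dev` are made-up names for the demo. It uses the `receive.denyNonFastForwards` and `receive.denyDeletes` settings.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the team's central repository.
git init -q --bare central.git
git -C central.git config receive.denyNonFastForwards true   # refuse forced updates
git -C central.git config receive.denyDeletes true           # refuse branch/tag deletion

# A developer clone with two pushed commits.
git clone -q central.git dev 2>/dev/null
cd dev
git config user.email "demo@example.com"
git config user.name "Demo"
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Rewrite local history, then try to force it onto the server.
git reset -q --hard HEAD~1
if git push -q --force origin "$branch" 2>/dev/null; then
    force_push=accepted
else
    force_push=rejected
fi

# Try to delete the branch on the server.
if git push -q origin --delete "$branch" 2>/dev/null; then
    delete_push=accepted
else
    delete_push=rejected
fi

echo "force push: $force_push, branch deletion: $delete_push"
```

With both settings on, the only way to rewrite the server's history is to have shell access to the server itself, which is roughly the guarantee the question is asking for.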
0x5453
  • What happens if, say, somebody accidentally deleted the wrong branch? (The fact that you *can* delete an entire branch is frankly terrifying.) Say somebody accidentally deleted the 2.2 release branch, and we only notice 6 months later, long after the GC has already pruned all the commits. Now what? – MathematicalOrchid Feb 02 '22 at 14:55
  • @MathematicalOrchid You've already had an answer. If you really fear something _that odd_ happening, simply set `gc.pruneExpire` to `never` once and for all, and this will never happen. You can also forbid people to `push -f` onto the master branch, while letting them do what they want on their own. Now consider the problem from the other end: what happens if you've created a branch by mistake, or for your own temporary use only? Because both of these actually happened to me using Mercurial. – Obsidian Feb 02 '22 at 16:47

Git offers ways to edit or manipulate the history. These features provide flexibility and convenience, and they address specific drawbacks in Git. Good examples of such features are rebasing, worktrees and things like `commit --amend`. Using them the right way (taking care and following best practices) makes Git (and therefore your workflow) more powerful.

It is important to say that these operations DO NOT delete things from the history. You can always go back to a previous state, even if you have already published the rewritten commits.

Git does have ways to delete things from the history. Doing that is dangerous and you should really know what you are doing. Because of this, people rarely do it, and I have actually never seen anyone do it. It is quite hard to completely(!) delete a commit or content anyway. Even if you reset your branch to a previous commit, your 'deleted' commits are still present (but detached). You can access them using `git reflog` and honestly, I have no idea how to absolutely delete them.
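
If you want those detached commits to stay around indefinitely rather than only for the default retention period, the comments under the question suggest setting `gc.pruneExpire` to `never`. A minimal sketch, assuming you also want the reflog itself to never expire (the two `gc.reflogExpire*` lines are my addition, not something from the question), run here in a throwaway repository:

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demo; in practice run the three
# `git config` lines inside the repository you want to protect.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Never prune unreachable objects during garbage collection.
git config gc.pruneExpire never
# Never expire reflog entries, including entries for commits that
# no branch or tag reaches any more.
git config gc.reflogExpire never
git config gc.reflogExpireUnreachable never

# Show the resulting settings.
git config --get-regexp '^gc\.'
```

The trade-off is that the repository only ever grows, so this is probably something to set on a central or backup copy rather than on every working clone.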

Felix