
I've seen other questions like this one about removing old files from Git. However, all solutions I found include rewriting the history, meaning that there has to be force-pushing involved and fellow developers will experience issues.

Is it possible to remove a file from Git's object database in order to decrease the repo size when cloning, without having to change the history? The obvious problem here is what happens when you check out a commit where the deleted file appears, but I'm fine with Git showing me a warning message that the file is missing. Can Git do that?

dodov
  • No, that's not possible by definition. You have to at least rip out (or modify) all parts of the history that mention that file. If that is only on some non-main branch then severing that branch might be sufficient, but any commit that is based on one that had that file will have to be rewritten. – Joachim Sauer Oct 06 '21 at 09:43
  • AFAIK: Nope, not possible. The commit id is a hash of the files included in the commit. You change the files, you change the commit id. – Oliver Oct 06 '21 at 09:43
  • No. By design, there is no such way. The commit graph forms a [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree) so that one knows if the commits have been messed-with. The main thing you *can* do here is make a *shallow clone*. Alternatively, stick large files elsewhere (e.g., a separate LFS server). – torek Oct 06 '21 at 09:43
  • @torek using LFS would still require a rewrite of the history, wouldn't it? You'd need to replace the actual files with their LFS references, which would change the commit hashes? – dodov Oct 06 '21 at 09:49
  • No: with the LFS wrappers, you don't store the big files in Git at all. Instead, you store a reference to an LFS server that stores a big file. So the commits have what amount to instructions: "go over here to get some file". Follow the instructions and you get the file. Use plain Git, or disable the instruction-following, and you get only the instructions. – torek Oct 06 '21 at 10:10
  • Having never stored the files *in* Git, you sidestep the problem. (Of course if you've made the mistake of storing the files in Git, well...) – torek Oct 06 '21 at 10:11

1 Answer


You can't have it both ways - if the file is part of the history, it will take up space. If you don't want it to take up space, it can't be part of the history.

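If you're concerned about the speed of cloning, you could limit how much history is downloaded by adding a flag like `--depth` or `--shallow-since` to your `git clone` command; `--depth 1`, for example, fetches only the newest commit.
Then, in the edge cases where you really need to go further back in time, you could explicitly fetch the missing commits, for example with `git fetch --deepen=<n>` or `git fetch --unshallow`.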
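
As a rough sketch of that workflow, the script below builds a throwaway three-commit repository, clones it with `--depth 1`, and later pulls in the rest of the history with `git fetch --unshallow`. The directory names (`origin-repo`, `shallow-clone`) and the demo identity are made up for illustration; in practice you would point `git clone` at your real remote URL.

```shell
set -eu

# Hypothetical scratch area; nothing here touches an existing repository.
work=$(mktemp -d)
cd "$work"

# Build a small stand-in for the real repository with three commits.
git init -q origin-repo
cd origin-repo
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3; do
  echo "$i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done
cd ..

# Shallow clone: only the newest commit is transferred.
# A file:// URL is used because --depth is ignored for plain local paths.
git clone -q --depth 1 "file://$work/origin-repo" shallow-clone
cd shallow-clone
shallow_count=$(git rev-list --count HEAD)
echo "commits after shallow clone: $shallow_count"

# Later, when older history is actually needed, fetch the remainder.
git fetch -q --unshallow
full_count=$(git rev-list --count HEAD)
echo "commits after unshallow: $full_count"
```

Note that this only reduces what each clone downloads; the file still lives in the server-side repository, so the hashes stay intact and nobody has to force-push.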

Mureinik