
I have a Git LFS repo with a lot of large files, and they keep getting updated. Each time I git pull, the version I already had stays saved somewhere on disk. After a year of pulling, a lot of unnecessary data (older versions of the large files) has piled up on my disk. I still want those versions to stay in the remote repo, but I don't want them to keep using up my local space. How do I keep mostly just the latest files on my local machine without altering anything in the remote repo? Thanks.

Edit: OK... I see some comments saying this is not possible... But is it really? Intuitively, one should be able to convert the "old large LFS files" into links in my local repo (that alone would save a huge amount of space), and whenever I update the large files, the new versions should just overwrite the most recent ones, right? I never touch anything other than the most recent version of the files, yet I do not want to remove the older ones from the remote, just in case.

More info: for example, in my current repo, .git/ occupies 5 GB (all of it unnecessary LFS files), while my relevant_working_folder/ is only 1 GB. If I cloned the repo fresh, as if it had never existed on my local machine, it should take only 1 GB, right?

user40780
  • Possible duplicate of [How to remove/delete a large file from commit history in Git repository?](https://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-git-repository) – Makoto Jan 29 '18 at 16:13
  • *I still want them to be in the remote repo, but i don't want them to keep using up my space*, you can't do this. GIT repos are full copies all the time, it's either in your repo (and local) or it's not, there is no half way house. It sounds like these files shouldn't be in your repo at all. Some repos implement [LFS](https://git-lfs.github.com/) which may help but it depends on your vendor/hosting – Liam Jan 29 '18 at 16:13
  • https://stackoverflow.com/questions/4515580/how-do-i-remove-the-old-history-from-a-git-repository – C-Otto Jan 29 '18 at 16:13
  • https://stackoverflow.com/questions/16854425/compact-repo-by-removing-old-commits – C-Otto Jan 29 '18 at 16:15
  • Possible duplicate of [Compact repo by removing old commits](https://stackoverflow.com/questions/16854425/compact-repo-by-removing-old-commits) – Liam Jan 29 '18 at 16:16
  • @C-Otto please read [How should duplicate questions be handled?](https://meta.stackexchange.com/questions/10841/how-should-duplicate-questions-be-handled). If you believe this to be a duplicate please flag it or if you have enough rep vote to close. – Liam Jan 29 '18 at 16:18
  • @Liam I'm pretty sure it is not a duplicate – C-Otto Jan 29 '18 at 16:18
  • So why are you spamming the comments with links? @C-Otto – Liam Jan 29 '18 at 16:18
  • @Liam the linked questions might provide valuable information – C-Otto Jan 29 '18 at 16:20

1 Answer


UPDATE - Originally I somehow read "git LFS" in the content of your question; I must've been mentally translating "large file" to "LFS", because if you have "large files" in your repo, you should be using LFS. Now I see that you just kinda-sorta mention LFS in your update, so maybe you've been using it and maybe you haven't.

Answer adjusted accordingly:


Not only is this possible, it is basic functionality of git lfs.

If you haven't been using lfs, then transitioning to it can be a bit of a problem, as it requires a rewrite of your repo's history. Your update can be read to suggest that you could just substitute a link in place of a file in the content of a repo; you cannot. git won't do it, and if you use non-git tools to fake it then you will be left with a corrupt repo. The storage in git is "content-addressable", which means a few things... but what's notable here is, it means the content is an integral part of the storage structure.

There are tools that support this type of rewrite: a dedicated LFS migration tool, as well as an LFS mode in the BFG Repo-Cleaner. Last I checked, each had its quirks. More importantly, the rewrite will be like a giant upstream rebase, messing up the branches of everyone who had cloned the original repo.

So... it's a good practice and worth thinking about, but maybe not something you can do for this repo (unless you were already using LFS). There are still options, contrary to what the comments would have you believe. So here's a bit about LFS, then a bit about doing without LFS...

Using LFS

There are several lfs-specific configuration values that help determine what should be kept in the local storage. Basically the values will help you define what a "recent branch" and a "recent commit" are, and then lfs will locally cache objects needed for recent commits on recent branches.

See the documentation for git lfs prune, as well as the --prune option of git lfs fetch, for details.
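As a rough sketch of what that can look like, the shell functions below set three of the retention settings and then run a prune. The function names `set_lfs_retention` and `prune_lfs_cache` are made up for illustration, and the values 7, 0 and 3 are only examples, not recommendations. The `lfs.*` keys and the `git lfs prune` flags are the ones I know from the git-lfs documentation; check `git lfs prune --help` for your installed version before relying on them.

```shell
# Hypothetical helper: configure what git-lfs treats as "recent" for the
# repository in the current directory.
#   $1 = days a branch/tag may be idle and still count as a recent ref
#   $2 = extra days of history to keep on each recent ref (0 = tip only)
#   $3 = safety margin in days added before an object becomes prunable
set_lfs_retention() {
    git config lfs.fetchrecentrefsdays "$1"
    git config lfs.fetchrecentcommitsdays "$2"
    git config lfs.pruneoffsetdays "$3"
}

# Hypothetical helper: preview the prune, then run it for real.
# Requires the git-lfs extension to be installed.
prune_lfs_cache() {
    # Report what would be deleted from .git/lfs without deleting anything.
    git lfs prune --dry-run --verbose
    # Delete old local LFS objects, but only after checking that the remote
    # still has a copy of each one.
    git lfs prune --verify-remote
}
```

Run from inside the working copy, usage would be something like `set_lfs_retention 7 0 3 && prune_lfs_cache`. Nothing here touches the remote; only the local cache under `.git/lfs` shrinks, and pruned objects are downloaded again if you later check out an old commit.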

Without LFS

You could use a "shallow clone". Only newer versions of git provide good support for it, so you might have to upgrade. And you'll have to periodically "re-shallow" your history. And any given version is either "in" or "out" - there's no "past this age, drop the big files". That's the sort of thing that specifically requires the techniques LFS uses.

See the --depth and --shallow-* options of git clone for details on what you can do and how.
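To make the shallow-clone and "re-shallow" steps concrete, here is a small self-contained demonstration. It builds a throwaway repository in a temp directory to stand in for the real remote (the paths, the file name `big.bin`, and the demo identity are all invented for the example), clones it with `--depth 1`, and then re-shallows after a new commit appears. Treat it as a sketch of one way to do it, not the only way.

```shell
set -eu
work=$(mktemp -d)

# Helper for the demo only: commit a new version of big.bin in the fake remote.
commit_version() {
    echo "version $1" > "$work/origin/big.bin"
    git -C "$work/origin" add big.bin
    git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "version $1"
}

# Throwaway "remote" with three versions of the file.
git init -q "$work/origin"
for i in 1 2 3; do commit_version "$i"; done

# Shallow clone: only the newest commit is transferred. The file:// URL
# matters here, because --depth is ignored for plain local-path clones.
git clone -q --depth 1 "file://$work/origin" "$work/shallow"
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "commits after shallow clone: $shallow_count"

# Time passes and the remote gains a new version.
commit_version 4

# Re-shallow: fetch only the new tip, move the local branch onto it, then
# expire the reflog and garbage-collect so the old version's objects are
# actually deleted from .git.
git -C "$work/shallow" fetch -q --depth 1 origin
git -C "$work/shallow" reset -q --hard origin/HEAD
git -C "$work/shallow" reflog expire --expire=now --all
git -C "$work/shallow" gc -q --prune=now
echo "commits after re-shallow: $(git -C "$work/shallow" rev-list --count HEAD)"
```

Against a real repository you would replace the `file://` URL with your remote's URL and skip the fake-remote setup. Note that `reset --hard` discards local changes, so this only suits the case in the question where the local copy is never modified.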

Mark Adelsberger
  • Thank you very much. Since we already use LFS, so I guess the answer for me would be trying to use git lfs prune? – user40780 Jan 29 '18 at 19:47