
I cloned my remote using --mirror:

git clone --mirror git-user@git-url.example.com:my-repo-name.git

Then I worked on the repo to remove some large files and some unwanted branches. The overall repo size shrank by about 10%.

I made a bundle of this reduced-size repo intending to push it. Then I test-restored this bundle to see what it looked like.

git clone my-repo-name.bdl my-repo-name 

The restored bundle is about 75% smaller but it contains all the branches, tags etc. and seems to have a complete history as I want. Should I trust this method of "archiving" as I've been told? The massively reduced file size makes me worry this is not correct. What did the restored bundle potentially leave out?

n8thanael

1 Answer


The main purpose of a bundle is to communicate changes to a repo to which you can't push (or which can't fetch from you), e.g. because of lack of network access. But bundles can be useful for a number of other things.

When you cleaned up your original repo, what steps did you take to ensure that the removed items were really purged from the repo? Since you got some reduction in size, I assume you ran git gc; but did you make sure to clear away any reflogs first, as well as all refs that might still point into the unwanted history? The old repo may still have a bunch of the history for stuff you removed, and that might account for the discrepancy.

That is, since your bundle won't have the reflogs and wouldn't include "weird" refs - like maybe the backup refs created by filter-branch - it is more likely to be a true minimal history of the refs you put in it; plus it's possible that some space was saved by repacking. (Similar clean-up can often be had by cloning the cleaned-up repo.)
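As a rough sketch of the clean-up described above, something like the following could be run against the cleaned-up mirror. The function name purge_unreachable is made up for illustration, and it assumes that any backup refs live under refs/original/ (where filter-branch puts them); adjust if your leftovers are elsewhere.

```shell
# Hypothetical helper: drop the leftovers that keep old objects alive,
# then repack. Takes the path to the repository as its only argument.
purge_unreachable() {
    repo=$1

    # Delete filter-branch backup refs (refs/original/*), if any exist.
    git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git -C "$repo" update-ref -d "$ref"
        done

    # Expire every reflog entry immediately so reflogs no longer
    # pin the old history.
    git -C "$repo" reflog expire --expire=now --all

    # Repack and prune unreachable objects right away instead of
    # waiting for the default grace period.
    git -C "$repo" gc --prune=now --quiet
}
```

For example: purge_unreachable /c/repoclone/clone.git, followed by du -d 1 -h to see whether the size now matches the restored bundle more closely.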

If a ref is written to the bundle, and if you can apply the bundle to an empty repo, then you can rest assured that the full history of that ref (including the directory structure and file content at each commit point) is present. It would be very surprising if that didn't account for the bulk of the required size of the repo.

If the history were somehow corrupt and missing data, git should've complained about it; but if you're worried, maybe git fsck on a repo to which you've applied the bundle would provide some additional assurance.
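One way to do that check without touching your real repos is to clone the bundle into a scratch directory and run git fsck there. This is only a sketch: check_bundle is a made-up name, and the scratch location comes from mktemp.

```shell
# Hypothetical helper: clone a bundle into a throwaway bare repo and
# run a full integrity check on it. Returns 0 only if both steps succeed.
check_bundle() {
    bundle=$1
    scratch=$(mktemp -d)

    # --mirror copies every ref the bundle contains, with no checkout.
    if git clone --quiet --mirror "$bundle" "$scratch/check.git" &&
        git -C "$scratch/check.git" fsck --full; then
        status=0
    else
        status=1
    fi

    rm -rf "$scratch"
    return "$status"
}
```

For example: check_bundle my-repo-name.bdl && echo "bundle looks intact". Anything fsck has to say (such as dangling commits) is printed as usual.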

What could be missing? Well, refs that you didn't bundle. So: Tags that aren't reachable from any branch. Notes refs maybe (if you use them). Replacement refs maybe (if you use them). Remote refs, I guess, though if you're doing a rewrite you probably don't want them. Or any branches you left out, if you gave too narrow a list of branches when creating the bundle. I can't say that list is exhaustive; in general, like I said, "other refs". You could run git for-each-ref in the old and new repos and compare the results to see what's not in the new one, if anything.
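The for-each-ref comparison could look roughly like this. The names list_refs and missing_refs are invented for the example, and it assumes both repos are mirror (or bare) clones so that ref names line up; in a plain clone of the bundle the branches end up under refs/remotes/origin/ and would all show as differences.

```shell
# Hypothetical helper: print "<object id> <ref name>" for every ref, sorted.
list_refs() {
    git -C "$1" for-each-ref --format='%(objectname) %(refname)' | sort
}

# Hypothetical helper: print refs that exist in the old repo but are
# absent from the new one, or that point at a different object there.
missing_refs() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    list_refs "$1" > "$old_list"
    list_refs "$2" > "$new_list"

    # comm -23 keeps only the lines unique to the first file.
    comm -23 "$old_list" "$new_list"

    rm -f "$old_list" "$new_list"
}
```

For example: missing_refs /c/repoclone/clone.git /c/repoclone/bundle.git. Empty output means every ref in the old repo is present in the new one at the same commit.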

You could also have bundled a "shallow" history, but you'd have had to specify that you wanted it, and it wouldn't apply easily to an empty repo. So if that's not what you're trying to do, it probably isn't what happened.

Mark Adelsberger
  • "what steps did you take to ensure that the removed items were really purged from the repo?"
    Checked for giant files: https://stackoverflow.com/questions/10622179/how-to-find-identify-large-files-commits-in-git-history#answer-42544963
    Cleaned: https://docs.acquia.com/article/removing-large-items-your-sites-git-history
    Removed unwanted directories: https://stackoverflow.com/questions/10067848/remove-folder-and-its-contents-from-git-githubs-history#answer-32886427
    Removed unwanted branches: $ git branch -D [branch]
    Checked size: $ du -d 1 -h
    – n8thanael May 03 '18 at 17:20
  • Thanks for your help...
    n8thanael@***** MINGW64 /c/repoclone/clone.git (BARE:master)
    $ git fsck
    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (43277/43277), done.
    dangling commit (there are 12 of these....)
    n8thanael@***** MINGW64 /c/repoclone/bundle.git (master)
    $ git fsck
    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (31096/31096), done.
    – n8thanael May 03 '18 at 17:24