We're currently attempting to migrate our Mercurial repository (hosted on an ancient version of Kiln) to BitBucket, and we immediately ran into issues with size (if you don't know, BitBucket imposes a rather generous 2 GB repo limit, which we happened to blow by).

Anyways, I've cleaned up the sins of the past:

  • using convert with filemaps (removing binaries/static files that should never have been in the repo)
  • creating separate repos for other things that shouldn't have been in the main repo
  • attempting to use generaldelta to reduce size (as per https://www.mercurial-scm.org/wiki/ScaleMercurial)
  • using branchmaps to try to consolidate old branches and their associated changesets
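
For readers who haven't used them, the filemap and branchmap steps above can be sketched roughly as follows. This is only an illustration: the excluded paths and the branch names are hypothetical placeholders, not the ones from our repo, and `write_convert_maps` is just a helper name made up for this sketch.

```shell
# Write a filemap and a branchmap for `hg convert` into the given directory.
# All paths and branch names below are hypothetical placeholders.
write_convert_maps() {
    dir="$1"
    # filemap: one directive per line (include / exclude / rename)
    cat > "$dir/filemap.txt" <<'EOF'
exclude lib/binaries
exclude static/images
EOF
    # branchmap: "<old branch name> <new branch name>" per line
    cat > "$dir/branchmap.txt" <<'EOF'
feature-old-1 archive
feature-old-2 archive
EOF
}
```

With both files in place, the conversion would be run along the lines of `hg convert --filemap filemap.txt --branchmap branchmap.txt oldrepo newrepo`.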

Even with these steps, I still have a very large manifest file. The "data" stored for the repo has shrunk down to a "manageable" size (~600 MB), but my manifest file is nearly 700 MB.
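
For anyone wanting to make the same comparison, here is a minimal sketch of how the two numbers can be measured. It assumes the standard store layout (`.hg/store/00manifest.i`, `.hg/store/00manifest.d` and `.hg/store/data/`); `store_sizes` is a hypothetical helper name, not a Mercurial command.

```shell
# Print the on-disk size in bytes of the manifest revlog and of the
# per-file revlogs for the repository at $1.
store_sizes() {
    store="$1/.hg/store"
    manifest=0
    for f in "$store/00manifest.i" "$store/00manifest.d"; do
        # 00manifest.d is only present when the revlog data is not inlined
        if [ -f "$f" ]; then
            manifest=$((manifest + $(wc -c < "$f")))
        fi
    done
    data=0
    if [ -d "$store/data" ]; then
        # Sum the bytes of every file under data/ (slow on big repos, but exact)
        data=$(( $(find "$store/data" -type f -exec cat {} + | wc -c) ))
    fi
    echo "manifest=$manifest data=$data"
}
```

Calling `store_sizes path/to/repo` prints both totals on one line. For more detail on the manifest revlog itself, `hg debugrevlog -m` reports its statistics.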

Some additional information: in general, we practice branch-per-feature and track two branches out to environments:

  • a release branch (deployed to staging and then to prod)
  • a default branch (originally branched off release; all features are merged here first and then to release. This branch dies and is reborn every two weeks)

One difference from gitflow/hgflow in this workflow is that default itself is never merged into release. Does this uni-directional flow into default cause issues?

We "only" have 120 open branch heads, so it seems like that's manageable?

I'm obviously missing some step here (or else the repo is just completely hosed).

ddango
  • It's possible that BitBucket might not "work" for our repo, but I find that very hard to believe. I am also worried that our repo will continue to balloon if we don't fix/change _something_. – ddango Nov 24 '15 at 20:03
  • This is going to be difficult to answer on SO (i.e. without some more detailed back-and-forth discussion on repository details). Have you considered asking the question on the Mercurial mailing list (which is also read by several people who have experience managing large repositories in Mercurial as well as at least one person working at Atlassian)? – Reimer Behrends Nov 24 '15 at 20:35
  • I agree with you - I'll give that a try. – ddango Nov 24 '15 at 20:55
  • Have a look at my answer to https://stackoverflow.com/questions/6616951/can-i-optimize-a-mercurial-clone/19294645#19294645. In particular, have you done the double-cloning technique for generaldelta? – Tim Delaney Nov 27 '15 at 12:34
  • I haven't yet tried the double-clone. I'll give that a try. Thanks! – ddango Nov 30 '15 at 17:20

1 Answer

Just for future reference, I followed Tim's suggestion above. My full script ended up looking like this:

hg --config format.generaldelta=1 clone --pull oldrepo oldrepo-generaldelta
hg --config format.generaldelta=1 clone --pull oldrepo-generaldelta oldrepo-generaldelta2
hg convert --filemap filemap.txt oldrepo-generaldelta2 newrepo

As Tim mentioned in his linked answer, our manifests went from about 700 MB down to about 40 MB with the second clone.
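
If you want to confirm that a clone actually picked up the format, one simple check is to look for a `generaldelta` line in the repository's `.hg/requires` file. A minimal sketch (`has_generaldelta` is a hypothetical helper name, not part of Mercurial):

```shell
# Succeed (exit status 0) if the repository at $1 lists generaldelta
# among its requirements, fail otherwise.
has_generaldelta() {
    grep -qx 'generaldelta' "$1/.hg/requires" 2>/dev/null
}
```

For example, `has_generaldelta oldrepo-generaldelta2 && echo yes` should print `yes` for the clones produced by the script above.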

Can I optimize a Mercurial clone?

Tim Delaney
ddango
  • For maximum impact, you will want your BitBucket repository to also be using generaldelta. I've done some searches but haven't been able to find any info on whether this is possible. – Tim Delaney Dec 01 '15 at 22:20