3

Is it possible to mirror a repository (in your choice of SVN, Git, Hg, etc.) by publishing changes to a "dumb" fileserver, and re-assembling them at the other end?

Diagram of the process

Possible concerns include

  • Handling of files that are moved/renamed/deleted
  • Binary files
  • Empty directories

For example, I understand svn diff > r1.patch means you lose the binary files, because SVN doesn't deign to include them.

Note: this isn't putting a repository on the fileserver (unless there's a way to do that without uploading the entire repository file) because we're trying to minimise bandwidth. Also, storing diffs instead of repository means that encryption is possible.

pidabrow
OJW
  • Please find my updated answer below, hope it will help you: http://stackoverflow.com/a/14402464/1047741 – shytikov Jan 22 '13 at 17:21

4 Answers

3

Yes, it's possible

Subversion

  • Make git-patch from revision N (svn diff -c N --git PATH > PATCH.N);
  • Store patch;
  • Apply patch on another side (svn patch PATCH.N <WCPATH>);
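A rough sketch of how the Subversion steps above could be scripted. The function names, the zero-padded `r000042.patch` naming scheme and the directory arguments are assumptions made for illustration, not part of Subversion. Keep in mind that `svn patch` only modifies the working copy, so the mirror side still needs an `svn commit` afterwards.

```shell
#!/bin/sh
# Sketch only: all paths passed to these functions are hypothetical.

# Zero-padded names keep patch files in revision order when listed.
patch_name() {
    printf 'r%06d.patch' "$1"
}

# Publish one revision as a git-format patch.
# usage: publish_rev REV SRC_WC OUTBOX
publish_rev() {
    svn diff -c "$1" --git "$2" > "$3/$(patch_name "$1")"
}

# Apply a published revision to the mirror working copy (no commit is made).
# usage: apply_rev REV DST_WC OUTBOX
apply_rev() {
    svn patch "$3/$(patch_name "$1")" "$2"
}
```

For example, `publish_rev 42 ./trunk /mnt/share` would write `/mnt/share/r000042.patch`, and `apply_rev 42 ./mirror-wc /mnt/share` would apply it on the other side.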

Mercurial

Single-changeset exchange (does not work well for merge changesets)

  • Export changeset N as patch (hg export -g -r N -o %b-%r.patch) (-o with filename for case of multirepo-storage in one dir);
  • Store patch;
  • Apply patch on another side (hg import --exact -s 100 FILENAME.patch).

Rangeset exchange

Bundle

  • Bundle range N:M (hg bundle --base "p1(N)" --rev N::M N-M.hg);
  • Store bundle;
  • Apply bundle on another side (hg unbundle -u N-M.hg).
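The bundle steps above might be wrapped like this. It is a minimal sketch: the function names, the zero-padded file naming and the directory arguments are hypothetical, and it assumes N is greater than 0 so that the revset `p1(N)` (first parent of N) names an existing changeset.

```shell
#!/bin/sh
# Sketch only: repository and outbox paths are hypothetical.

# File name for the bundle that carries revisions N..M.
bundle_name() {
    printf '%06d-%06d.hg' "$1" "$2"
}

# Bundle changesets N::M, assuming the receiver already has the parent of N.
# usage: bundle_range SRC_REPO N M OUTBOX
bundle_range() {
    hg -R "$1" bundle --base "p1($2)" --rev "$2::$3" "$4/$(bundle_name "$2" "$3")"
}

# Add the bundled changesets to the mirror and update its working directory.
# usage: apply_bundle DST_REPO BUNDLE_FILE
apply_bundle() {
    hg -R "$1" unbundle -u "$2"
}
```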

Export

  • Export rangeset N:M as patches (hg export -o %b-%r.patch -r N::M) (set of patches with filename "repobasename-revno.patch");
  • Store patches;
  • Apply the patches on another side sequentially (from lowest to highest revision number).
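Applying the patches in order is the easy part to get wrong, because plain alphabetical order puts `repo-10.patch` before `repo-9.patch` when the zero-padding differs between exports. The sketch below sorts by the number after the last hyphen instead. The function names are hypothetical, file names are assumed to contain no tabs or newlines, and the import flags are kept minimal (`--exact` only).

```shell
#!/bin/sh
# Sketch only: repository and outbox paths are hypothetical.

# Export changesets N::M as git-style patches named <repo>-<rev>.patch.
# usage: export_range SRC_REPO N M OUTBOX
export_range() {
    hg -R "$1" export -g -r "$2::$3" -o "$4/%b-%r.patch"
}

# Print the patch files in DIR ordered by numeric revision.
# usage: patches_in_order DIR
patches_in_order() {
    for f in "$1"/*.patch; do
        [ -e "$f" ] || continue      # no patches: the glob stayed literal
        rev=${f%.patch}              # drop the extension
        rev=${rev##*-}               # keep the text after the last hyphen
        printf '%s\t%s\n' "$rev" "$f"
    done | sort -n | cut -f2-
}

# Import every patch, lowest revision first.
# usage: apply_patches DST_REPO OUTBOX
apply_patches() {
    patches_in_order "$2" | while IFS= read -r p; do
        # The loop runs in a subshell, so exit only stops the loop.
        hg -R "$1" import --exact "$p" || exit 1
    done
}
```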
shytikov
Lazy Badger
2

For git there is the bundle command. It creates a binary archive of the given revisions. You can put any piece of your history into a bundle, either one commit at a time or in batches, copy it to the file server, and then restore it into the repository on the other side.

git bundle create file.name revisions..list — command for creating bundle.

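git bundle unbundle file.name is the command to restore revisions. It only unpacks the objects and prints the refs, so to update branches as well you can fetch from the bundle instead: git fetch file.name.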

Definitely, you should come up with a chronological naming scheme for your bundles so that you don't mix them up.
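One possible way to script incremental bundles with chronological names is sketched below. The UTC-timestamp file names, the `last-published` tag used to remember what was already sent, the `source` remote-tracking namespace and the function names are all assumptions for illustration. Pass absolute paths, since `git -C` changes directory before the bundle path is resolved.

```shell
#!/bin/sh
# Sketch only: repository, branch and outbox arguments are hypothetical.

# UTC timestamps sort chronologically as plain text.
bundle_name() {
    date -u +'%Y%m%dT%H%M%SZ.bundle'
}

# Bundle everything on BRANCH that was not in the previous bundle.
# usage: publish_bundle REPO BRANCH OUTBOX
publish_bundle() {
    repo=$1
    branch=$2
    out=$3/$(bundle_name)
    if git -C "$repo" rev-parse -q --verify refs/tags/last-published >/dev/null; then
        range="last-published..$branch"   # only commits since the last bundle
    else
        range="$branch"                   # first run: the whole branch
    fi
    # Move the marker tag only if the bundle was actually written.
    git -C "$repo" bundle create "$out" "$range" &&
        git -C "$repo" tag -f last-published "$branch"
}

# Check that the mirror has the prerequisite commits, then fetch the bundle.
# usage: apply_bundle MIRROR BUNDLE_FILE
apply_bundle() {
    git -C "$1" bundle verify "$2" &&
        git -C "$1" fetch "$2" 'refs/heads/*:refs/remotes/source/*'
}
```

Bundles have to be applied in the order they were created, because each one depends on commits carried by the previous one.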

It works for git and, as far as I remember, hg has a bundle command as well. This matches the approach you drew.

Another option is to init a new intermediate repository in your Dropbox folder, push your commits there from the main repository, and let Dropbox synchronize it with the mirror. However, in this case pulling into the mirror repository should be done only after Dropbox synchronization has finished. Otherwise the data might be inconsistent, since git uses a whole lot of small files to hold repository contents. It is possible to reduce this risk by packing the repository content. But if you want to be on the safe side, the bundle approach will work best for you...

EDIT: Regarding svn, I got another clue recently. If for git and hg you can use the standard backup approach to achieve what you need, why not try svn's standard backup approach, as described in this Q&A?

svnadmin dump repositorypath -r LO_REV:HI_REV > backupname.svn to backup revisions.

svnadmin load repositorypath < backupname.svn to restore data.
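To keep each dump small, `svnadmin dump` can be given `--incremental`, otherwise the first revision in the range is written as a full copy of the tree, and `--deltas` to store changes as deltas. A hedged sketch of publishing only the new revisions follows. The state file that remembers the last published revision, the file naming and the function names are assumptions, not svn features.

```shell
#!/bin/sh
# Sketch only: repository, outbox and state-file paths are hypothetical.

# File name for a dump that carries revisions LO..HI.
dump_name() {
    printf 'r%06d-r%06d.svndump' "$1" "$2"
}

# Dump every revision committed since the last run.
# usage: publish_new_revs REPO OUTBOX STATE_FILE
publish_new_revs() {
    last=$(cat "$3" 2>/dev/null || echo 0)      # 0 if nothing was published yet
    youngest=$(svnlook youngest "$1") || return 1
    [ "$youngest" -gt "$last" ] || return 0     # nothing new to send
    lo=$((last + 1))
    svnadmin dump "$1" -r "$lo:$youngest" --incremental --deltas \
        > "$2/$(dump_name "$lo" "$youngest")" &&
        echo "$youngest" > "$3"
}

# Load one dump file into the mirror repository.
# usage: load_dump MIRROR_REPO DUMP_FILE
load_dump() {
    svnadmin load "$1" < "$2"
}
```

Dumps must be loaded in revision order on the mirror side, with no gaps.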

Community
shytikov
0

Is there any reason you are not just using a repository host like GitHub? That would minimize bandwidth, as git only pushes diffs.

Chronial
  • Yes, several reasons. Including: cost per unit storage, and ability to prevent fileserver from seeing the data being transferred. Packaging changes this way also allows sending changes by email or on a USB key not large enough to hold the entire repository. – OJW Jan 22 '13 at 18:28
  • `cost per unit storage`: Github is incredibly cheap and these costs should not be relevant compared to your development costs. `prevent fileserver from seeing the data`: Unless you are developing something for the military or a secret service, I’d say it’s a little bit extreme to distrust github with your data. `Sending changes by email or on a USB key`: Sure thing, but unless your repository is extremely large, the time and effort for this are way higher than for a simple `git fetch` from a remote. – Chronial Jan 22 '13 at 19:06
  • Don’t get me wrong: I’m not trying to say DO IT MY WAY, IT’S THE ONLY RIGHT ONE, but it seems that you might have made your decision not to use the standard solution to your problem rather rashly. Using bundles will most likely be way more expensive than using a git hoster, as the time your devs are spending on bundling etc. will probably amount to way more than the cost for a git hoster would be. – Chronial Jan 22 '13 at 19:10
  • unless.. unless.. might.. most likely.. will probably... -- if any of those conditions doesn't hold then "use github" doesn't answer the question (and ignores the case where no network connection is available, hence USB key needed). Thanks for the reply, but the question is specifically about something other than using github. – OJW Jan 22 '13 at 20:30
0

git does a pretty good job by itself in just sending the needed differences, so this isn't needed at all. Same for mercurial (hg), and I'd be very surprised if bzr didn't. Centralized VCS like svn or cvs don't, obviously. Your shipping pieces over to a fileserver makes sense only if there are other restrictions (if so, we need to know about them to be able to really help).

You could also use rsync(1) to copy repositories over, but in that case you can't count on the VCS ensuring that nobody sees half-updated stuff, so be extremely careful if you do something like that. In general (except for bzr) you can just, e.g., back up and restore the repository elsewhere and it should work fine.

In any case, never blindly trust what somebody hiding behind a colored square with missing bits tells you... check for yourself, read the documentation, look for recommended practices, and experiment a bit.

vonbrand
  • _"Centralized VCS like svn don't, obviously"_ - to be fair, if we permit the server to be running a version-control program instead of just a dumb fileserver (which isn't the case in this question, as discussed in Chronial's answer), then **svnsync** would be the equivalent command. – OJW Jan 23 '13 at 13:49