
I have to work on two repositories and want to move one directory with code between the two while keeping its revision history.

I read a few questions here on SO, but am still not sure which way to go. Our repositories are HUGE (the working copy alone, not including revisions, is several GB), since everything is checked in (code + design data + ...).

The solutions I have seen so far are:

  1. svnadmin dump + filter + import: not an option due to repository size
  2. svnsync: We already have data in the second repository (the repositories are huge already, so I don't think merging them is a good idea; besides, deciding that is not my job), and from what I gathered this requires the second repository to be empty.
  3. Third-party solution: crashes repeatedly because it cannot delete a file, reporting that "another process has the file opened" (even though I can remove the file via the OS, and the file gets created by the script itself)

Are there other solutions, approaches to this, or am I missing something about one of the solutions?

bahrep
ted
  • Why is the first option a problem? You can pass the dump through a filter without saving to a file. – Dmitry Pavlenko Oct 22 '12 at 14:28
  • @DmitryPavlenko: Considering that the repository is approximately 20 GB checked out and the revision numbers are around 6 digits (112 xxx) at the moment, the dump must be considerably larger. Dumping the complete repository to move a code directory which is a couple of megs in size seems overkill. This puts a lot of unnecessary stress on the system: check out every revision/dump => "copy" 200G or considerably more (I have to check the repo size) => run the filter on 200G of data. If you consider that this will run over a network which is already under heavy use, it inhibits our workflow. – ted Oct 22 '12 at 15:02
  • @DmitryPavlenko: In short, I am not sure I have 200G at my disposal (I am not an admin, just a user), and there is the network impact and overhead. Now consider that I want to keep working while this amount of data is moved. But what does svn dump do if there are repository accesses in the meantime? – ted Oct 22 '12 at 15:04
  • I really think option 1 is your best bet. Running the dump/filter directly on the svn server will alleviate network load. It may be a one-time pain, but you'll be happier through the life of the repo. – thekbb Oct 22 '12 at 15:05
  • You will want the repo offline when you take the dump. – thekbb Oct 22 '12 at 15:06
  • Near-duplicate of http://stackoverflow.com/questions/417726/how-to-move-a-single-folder-from-one-subversion-repository-to-another-repository – reinierpost May 27 '16 at 13:53

1 Answer


The svnadmin dump + filter + import approach works even with large repositories.

There are some things you need to do for performance. Find out which revision first created the folder you want to copy. Then check the log to find the last revision that modified anything in that folder. You only need to dump revisions in that range.
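One way to get that range is to read it off the quiet log of the folder. The following is a minimal sketch, not a tested recipe for any particular setup: `rev_range` is a hypothetical helper name, and the repository URL in the usage comment is a placeholder. It assumes the usual `svn log -q` output, where each revision line starts with `r<number> |`. Note that `svn log` follows copies by default, so the range may include revisions from before the folder was moved or copied to its current path.

```shell
#!/bin/sh
# rev_range: hypothetical helper that reads `svn log -q` output on stdin
# and prints "<lowest>:<highest>" of the revision numbers it finds.
rev_range() {
    awk '
        # Revision lines look like: r10000 | user | date
        /^r[0-9]+ / {
            n = substr($1, 2) + 0          # drop the leading "r"
            if (!seen || n < min) min = n
            if (!seen || n > max) max = n
            seen = 1
        }
        END { if (seen) print min ":" max }
    '
}

# Usage (placeholder URL, shown as a comment so the sketch runs without svn):
#   svn log -q "http://svn.example.com/repo/projects/source" | rev_range
```

The printed value can be passed directly to the `-r` option of `svnadmin dump`.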

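Use the --incremental flag with svnadmin dump. Without it, the first revision in the range is dumped as a full copy of the tree at that revision rather than as a change against the previous one.

Do not use the --deltas flag with svnadmin dump; svndumpfilter won't work on dumps created with deltas. Don't save the huge dump to a file and then run svndumpfilter on the file. Instead, do it in one step with a pipe.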

If your start revision was 10000, your end revision was 20000, and the path you want to copy was projects/source, the command should look like this:

svnadmin dump --incremental -r10000:20000 YourRepoPath | svndumpfilter include projects/source --drop-empty-revs --renumber-revs > source.dump

Follow this with an svnadmin load command to load the dump into your other repo.
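If both repositories are reachable as filesystem paths on the same machine, the intermediate file can be skipped and the filtered stream piped straight into the load. This is a rough sketch under that assumption: `copy_dir_history` and all paths in the usage comment are hypothetical, and `--parent-dir` is used on the assumption that the history should land under an existing directory of the target repository. The parent directories of the filtered path (here `projects`) most likely have to exist under that directory before loading.

```shell
#!/bin/sh
# copy_dir_history: hypothetical wrapper that streams one directory's
# history from a source repository into a target repository.
copy_dir_history() {
    src_repo=$1   # filesystem path of the source repository
    dst_repo=$2   # filesystem path of the target repository
    range=$3      # revision range to dump, e.g. 10000:20000
    path=$4       # directory to keep, e.g. projects/source
    parent=$5     # existing directory in the target repo to load under

    # dump -> filter -> load in one pipe, so no dump file is written
    svnadmin dump --incremental -r "$range" "$src_repo" \
        | svndumpfilter include "$path" --drop-empty-revs --renumber-revs \
        | svnadmin load --parent-dir "$parent" "$dst_repo"
}

# Usage (placeholder paths, shown as a comment):
#   copy_dir_history /srv/svn/repoA /srv/svn/repoB 10000:20000 projects/source imported
```

Trying this against a scratch copy of the target repository first seems prudent, since a load adds new revisions that cannot simply be undone.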

maddoxej