4

I have a huge SVN repository that I need to dump once a week. The repository is about 66 GB on disk; the output of "svnadmin dump" is about 184 GB.

Is it possible to reduce the size of the dump, in order to have a safer backup procedure?

I have read that other people use svnsync, but my goal is to export the data, so the best solution at the moment is "the dump plus a copy of the hooks and authorization data".

How do other people manage the backup of huge SVN repositories?

Thank you very much.

  • That's nothing, my 2GB repo led to a dump file of 500GB! – Matthew Lock Jan 23 '19 at 05:27
  • I had a SVN dump that was 80GB. I had to write [my own SvnDumpFileParser](https://github.com/cstroe/svndumpapi), and implement [filters/mutators](https://github.com/cstroe/svndumpapi#mutators) in Java to filter out binary files and other unnecessary files and revisions, which reduced it to <1 GB after restoring the repository from the dump file. It also allowed us to upgrade from SVN 1.6 to the latest at the time (SVN 1.8). – cstroe Nov 01 '20 at 21:15

2 Answers

2

The `--deltas` option should help reduce the size of the repository dump. For example:

svnadmin dump --deltas REPO

Alternatively you may use svnadmin hotcopy or svnadmin hotcopy --incremental for backups.
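For a scripted weekly job, the hotcopy variant could be wrapped roughly as follows. This is only a sketch: the function names and the example paths are hypothetical, and it assumes `svnadmin` is on the `PATH` and recent enough to support `hotcopy --incremental`.

```python
import subprocess
from typing import List


def hotcopy_command(repo: str, target: str, incremental: bool = False) -> List[str]:
    """Build the argument list for `svnadmin hotcopy` (hypothetical helper)."""
    cmd = ["svnadmin", "hotcopy"]
    if incremental:
        # Only copy what changed since the previous hotcopy into `target`.
        cmd.append("--incremental")
    cmd += [repo, target]
    return cmd


def run_hotcopy(repo: str, target: str, incremental: bool = False) -> None:
    """Run the hotcopy and raise CalledProcessError if svnadmin fails."""
    subprocess.run(hotcopy_command(repo, target, incremental), check=True)
```

Calling `run_hotcopy("/srv/svn/repo", "/backup/repo", incremental=True)` (both paths are placeholders) would refresh an existing backup copy instead of rewriting it in full each week.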

bahrep
Ivan Zhakov
1

I tried a few scenarios for svnadmin dump, and these are my results. They are not at the 66G scale, but the ratios should be similar.
Original folder size on Windows: 255M (one repository)

case 1: Stop service, zip that folder, restart service: 180M
case 2: svnadmin dump + zip | dump size: 1408M (!!??), zip size: 410M
case 3: svnadmin dump --deltas + zip | dump size: 550M, zip size: 270M

Even with --deltas, the dump is too large for our needs. Since I can stop the service at night, we'll use case 1: stop VisualSVN Server, zip the repository folder, and restart the service.
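To make the comparison explicit, the sizes reported above can be expressed as ratios of the original 255M folder. The snippet below only restates those figures; the variable and function names are mine.

```python
# Sizes in MB, as reported in the three cases above.
ORIGINAL_MB = 255

RESULTS_MB = {
    "case 1: zipped repository folder": 180,
    "case 2: plain dump": 1408,
    "case 2: plain dump, zipped": 410,
    "case 3: dump --deltas": 550,
    "case 3: dump --deltas, zipped": 270,
}


def ratio(size_mb: float, original_mb: float = ORIGINAL_MB) -> float:
    """Size of a backup artifact relative to the repository on disk."""
    return size_mb / original_mb


for label, size_mb in RESULTS_MB.items():
    print(f"{label}: {ratio(size_mb):.2f}x the original")
```

Only the zipped folder comes out smaller than the repository itself. For comparison, the figures in the question (184 GB dump from a 66 GB repository) give a ratio of about 2.8.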

If you prefer the "dump" method, you can pipe it directly into 7-Zip without an intermediate file, as explained in that other solution.
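The same idea of compressing the dump stream without an intermediate file can be sketched in Python. Note the substitution: this uses gzip from the standard library rather than 7-Zip, so the compression ratios will differ from the figures above. The helper names are hypothetical, and `svnadmin` is assumed to be on the `PATH`.

```python
import gzip
import shutil
import subprocess
from typing import List


def dump_command(repo: str, deltas: bool = True) -> List[str]:
    """Build the argument list for `svnadmin dump` (hypothetical helper)."""
    cmd = ["svnadmin", "dump", "--quiet"]
    if deltas:
        cmd.append("--deltas")
    cmd.append(repo)
    return cmd


def stream_to_gzip(argv: List[str], out_path: str) -> None:
    """Run a command and gzip its stdout straight into out_path."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        with gzip.open(out_path, "wb") as out:
            # Copy in chunks, so the uncompressed dump never touches the disk.
            shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)
```

A call such as `stream_to_gzip(dump_command("/srv/svn/repo"), "repo.dump.gz")` (placeholder paths) would write only the compressed file. To restore, the file would be decompressed and fed to `svnadmin load` on standard input.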

Community
foxontherock
  • 1
    *case 2*: nothing strange here, `svnadmin dump` without `--deltas` stores every revision as full text and does not use any space-saving techniques. *case 3*: dump streams do not use the other space-saving settings that SVN repositories use, so in most cases they will be larger than the repository on disk. BTW, have you tried the new built-in [Backup and Restore](https://www.visualsvn.com/server/features/backup/) in VisualSVN Server 3.6.x? – bahrep May 19 '17 at 14:24
  • @bahrep Yes I have. The .vsvnbak file it generates looks like case 1; it's not a "dump". They add a .vsvnbak subfolder at the root for the additional information they need. It looks like they use svnadmin hotcopy directly to a zip file, so the only difference from my case 1 is the zip compression ratio. But I can't automate my backup process with the free version of VisualSVN, so I'll continue to run "case 1" from a batch file daily. – foxontherock May 19 '17 at 19:35
  • 1
    @foxontherock you can automate full backups with the free Standard Edition of VisualSVN Server -- you can use the [`Backup-SvnRepository`](https://www.visualsvn.com/support/topic/00088/#Backup-SvnRepository) PowerShell cmdlet in a script. This cmdlet makes full repo backups only, though. The Enterprise Edition enables scheduled background jobs that support efficient incremental backup. Depending on the size of your repos you might find incremental backups very useful. :) – bahrep May 22 '17 at 20:46