I have an SVN repository that is 8 GB when I measure the folder size on the server.

But when I check it out locally (simply checking out the full repository from the root, with all branches/tags), it's 50+ GB (and still counting).

It seems that SVN does a good job at compressing its content. How come the size is so different?

And is there any way of computing the actual size of the repository without having to do a full checkout locally?

carlspring
u123
  • If one of the given answers is the accepted answer for the question, then you should accept it by clicking the hollow check mark next to the answer, so that it becomes green. If you found a different solution, then you should describe it here and accept it. – Dialecticus Oct 03 '15 at 09:40

3 Answers

When you make a branch of a folder in the repository, the contents of that folder are not actually copied. There is no point in wasting space on mere duplication of data. For more info see the chapter about branching in the good book, especially the boxed text at the end of the section Creating Branches, titled Cheap Copies.

As for the question of the actual size, I know of no way to compute it, but I doubt it is necessary. If there is no room, then make some room. The main point is that you don't need all the branches in your working copy, so check out only what you really need.

Dialecticus
  • Therefore the obvious next step is the advice: *never* check out the full repository recursively from root. – Ben Jul 13 '15 at 14:26
  • @Ben added the advice :) – Dialecticus Jul 13 '15 at 15:37
  • But don't check out too little either. In my experience it is better to get a whole branch, no less and no more. It is easier to support different properties at the branch level (no longer relevant as of 1.8), always merge the whole branch, and work against the whole source tree. – Sergey Azarkevich Jul 13 '15 at 15:46
You can try to calculate the size of a revision without a checkout (this requires some tricks and manual work), either for any subtree in the repo or for the full tree (checking out the full tree is a bad idea and wasted space anyway).

  1. svn ls -v -R ROOT/OF/ > file.log
  2. Remove all lines that describe directories rather than files from this log

1300 lazybadg май 07 2014 city/

(a directory line ends with a slash and has one field fewer than a file line)

  3. Sum all values (in bytes) from the third field (in awk terms) of every line in the cleaned log
625 Infinity         1272 янв 22  2010 city/Siberia.bmp
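
The three steps above can be sketched in Python as follows. This is only an illustration, not part of the original answer: the function name `sum_listing_sizes` is made up here, and it assumes the `svn ls -v -R` line layout shown in this answer (revision, author, size, date, path). A line for a locked file carries an extra column before the size, so such lines would be skipped by this sketch.

```python
def sum_listing_sizes(lines):
    """Sum the size column (third field) of file lines in `svn ls -v -R` output.

    Directory lines end with a slash and have no size field, so they are skipped.
    """
    total = 0
    for line in lines:
        line = line.rstrip()
        if not line or line.endswith("/"):
            continue  # blank line or directory entry
        fields = line.split()
        # Assumption: the third field is the size in bytes; skip anything else.
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        total += int(fields[2])
    return total


# A few lines taken from the sample listing in this answer.
sample = [
    "  5 lazybadg              фев 07  2014 ./",
    "  2 lazybadg              ноя 28  2013 branches/FullHTML/",
    "  2 lazybadg      1521542 ноя 28  2013 branches/FullHTML/natasha_i_budushee.html",
    "  2 lazybadg          146 ноя 28  2013 readme.textile",
    "  5 lazybadg        46394 фев 07  2014 trunk/G1.txt",
]
print(sum_listing_sizes(sample))
```

To run it on the real log, pass the opened file instead of the sample list, for example `sum_listing_sizes(open("file.log", encoding="utf-8"))`; the encoding is an assumption and depends on your locale.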

Sample from my small repo

Full ls

  5 lazybadg              фев 07  2014 ./
  2 lazybadg              ноя 28  2013 branches/
  2 lazybadg              ноя 28  2013 branches/FullHTML/
  2 lazybadg      1521542 ноя 28  2013 branches/FullHTML/natasha_i_budushee.html
  2 lazybadg          146 ноя 28  2013 readme.textile
  1 www-data              ноя 27  2013 tags/
  5 lazybadg              фев 07  2014 trunk/
  5 lazybadg        46394 фев 07  2014 trunk/G1.txt
  2 lazybadg        22203 ноя 28  2013 trunk/G10.txt
  2 lazybadg        18974 ноя 28  2013 trunk/G11.txt
  2 lazybadg        23795 ноя 28  2013 trunk/G12.txt
  2 lazybadg        24996 ноя 28  2013 trunk/G13.txt
  2 lazybadg        27358 ноя 28  2013 trunk/G14.txt
  2 lazybadg        24855 ноя 28  2013 trunk/G15.txt
  2 lazybadg        22481 ноя 28  2013 trunk/G16.txt
  2 lazybadg        40970 ноя 28  2013 trunk/G17.txt
  2 lazybadg        27761 ноя 28  2013 trunk/G18.txt
  2 lazybadg        33974 ноя 28  2013 trunk/G19.txt
  2 lazybadg        38287 ноя 28  2013 trunk/G2.txt
  2 lazybadg        30880 ноя 28  2013 trunk/G20.txt
  2 lazybadg        24692 ноя 28  2013 trunk/G3.txt
  2 lazybadg        38140 ноя 28  2013 trunk/G30.txt
  2 lazybadg        36509 ноя 28  2013 trunk/G31.txt
  2 lazybadg        57408 ноя 28  2013 trunk/G32.txt
  2 lazybadg        74241 ноя 28  2013 trunk/G34.txt
  2 lazybadg        74800 ноя 28  2013 trunk/G36.txt
  2 lazybadg        22123 ноя 28  2013 trunk/G4.txt
  2 lazybadg        12631 ноя 28  2013 trunk/G5.txt
  2 lazybadg        32373 ноя 28  2013 trunk/G6.txt
  2 lazybadg        16433 ноя 28  2013 trunk/G7.txt
  2 lazybadg        25243 ноя 28  2013 trunk/G8.txt
  2 lazybadg        17669 ноя 28  2013 trunk/G9.txt

Dirs removed

  2 lazybadg      1521542 ноя 28  2013 branches/FullHTML/natasha_i_budushee.html
  2 lazybadg          146 ноя 28  2013 readme.textile
  5 lazybadg        46394 фев 07  2014 trunk/G1.txt
  2 lazybadg        22203 ноя 28  2013 trunk/G10.txt
  2 lazybadg        18974 ноя 28  2013 trunk/G11.txt
  2 lazybadg        23795 ноя 28  2013 trunk/G12.txt
  2 lazybadg        24996 ноя 28  2013 trunk/G13.txt
  2 lazybadg        27358 ноя 28  2013 trunk/G14.txt
  2 lazybadg        24855 ноя 28  2013 trunk/G15.txt
  2 lazybadg        22481 ноя 28  2013 trunk/G16.txt
  2 lazybadg        40970 ноя 28  2013 trunk/G17.txt
  2 lazybadg        27761 ноя 28  2013 trunk/G18.txt
  2 lazybadg        33974 ноя 28  2013 trunk/G19.txt
  ...

The summation is not shown because it is obvious, but note:

The real size of a checkout will be bigger than the calculated size, because the .svn directory with metadata (including a pristine copy of every file) also requires space in the working copy

  • Sum of plain size: 2336878
  • Size of real fresh checkout: 4725986
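
If you do have a checkout at hand, the two numbers above can be compared with a small script. The following is a minimal sketch, not from the original answer: the helper name `tree_size` and the path `my_checkout` are hypothetical, and it simply adds up file sizes with and without the `.svn` directory.

```python
import os


def tree_size(root, skip_dirs=()):
    """Return the total size in bytes of regular files under `root`.

    Directories whose name is listed in `skip_dirs` are not descended into.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not enter them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Calling `tree_size("my_checkout", skip_dirs=(".svn",))` should give roughly the plain size, while `tree_size("my_checkout")` gives the size including the working copy metadata.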
Lazy Badger
  • Thanks, but looks near-impossible in corporate environment. The information should be part of svn info. Size is basics. –  Oct 24 '22 at 09:02
I have an SVN repository that is 8 GB when I measure the folder size on the server.

But when I check it out locally (simply checking out the full repository from the root, with all branches/tags), it's 50+ GB (and still counting).

Do not check out a working copy from the repository root unless you need to have all project branches, tags and shelves on your workstation. Normally you don't need such a working copy for daily work with SVN.

It seems that SVN does a good job at compressing its content. How come the size is so different?

Branches, tags, shelves (and copy operations) do not take much space in the repository storage on the server side. For example, a new branch in the repository takes a minimal amount of space (several kilobytes). A branch or tag in SVN is a cheap copy: when you create one, Subversion doesn't actually duplicate any data in the repository. Moreover, SVN repositories use several other techniques to save space.

However, a local working copy of the repo root on your workstation will contain every branch and tag as a full copy, so it will take up much more space than the repository does.

And is there any way of computing the actual size of the repository without having to do a full checkout locally?

Disk usage or size of my entire Subversion repository

Just check the size of the repositories on disk.

In case you use VisualSVN Server, try the Measure-SvnRepository cmdlet. It will produce the following output:

Name                                Revisions                 Size           SizeOnDisk
----                                ---------                 ----           ----------
MyRepo                                    498             3,340 KB             4,529 KB
MyRepo2                                   479            21,313 KB            22,571 KB
MyRepo3                                   201             1,032 KB             2,226 KB
MyRepo5                                     2                71 KB                90 KB

You can also view and examine the repository storage statistics with the svnfsfs stats tool. Here is an example:

svnfsfs stats C:\Repositories\MyRepository
bahrep
  • Measure-SvnRepository sounds good, but user-side availability? –  Oct 24 '22 at 09:00
  • @KJ from the user's side I think that you can use the suggestion from https://stackoverflow.com/a/31394621/761095 or use the `svn export` command to download data from the repository and then calculate the actual file and directory sizes. – bahrep Oct 24 '22 at 10:47
  • Sorry no. Impossible in corporate environment to perform this. –  Jun 23 '23 at 10:22