I have migrated an old CVS repository with cvs2git (cvs2svn). The resulting dump file is 72 GB, and my attempts to import it via git fast-import always fail with an out-of-memory error:

fatal: Out of memory, malloc failed (tried to allocate 6196691 bytes)
fast-import: dumping crash report to fast_import_crash_13097
error: git-fast-import died of signal 11

My system has 32 GB of RAM and 50 GB of swap. I am running the import on Red Hat 5.3 with Git 1.8.3.4 (gcc44, python2.6.8, cvs2svn2.4.0). I have also tried removing the limits on stack size and file descriptors, but the memory error persists.

Does anybody have an idea?

Brian Tompsett - 汤莱恩
user2451418

2 Answers

The idea is to split the CVS repo into smaller (sub-)repos and to clean it up.

Then you would import the cvs (sub-)repos into individual git repos.
Since git is distributed, and not centralized, you want to keep the size of each git repo reasonable.
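To judge what a "reasonable" size is and where to split, it can help to measure how much each top-level module of the CVS repository contributes. The following is only a minimal sketch: it assumes the CVS repository is a plain directory tree of `,v` files, and the function names and the 100 MB default threshold are illustrative choices, not part of cvs2git or git.

```python
import os
from collections import defaultdict


def module_sizes(cvs_root):
    """Return {top-level directory name: total bytes of all files below it}."""
    sizes = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(cvs_root):
        rel = os.path.relpath(dirpath, cvs_root)
        # Files that sit directly in the root are grouped under ".".
        module = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                sizes[module] += os.path.getsize(path)
    return dict(sizes)


def large_files(cvs_root, min_bytes=100 * 1024 * 1024):
    """Return (size, path) pairs for files of at least min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(cvs_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                size = os.path.getsize(path)
                if size >= min_bytes:
                    hits.append((size, path))
    return sorted(hits, reverse=True)
```

Calling `module_sizes("/path/to/cvsroot")` (a placeholder path) gives a per-module byte count that can guide the split, and `large_files` lists candidates to review before converting.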

Community
VonC
  • Thanks. The problem with splitting is that I would lose the cross-component tags and branches, or not? Also, I have searched for big files in my CVS repo; there are only a few files that are 130 MB in size. Yes, the idea of individual repos is the right way, but as a first step after migrating from CVS to git one should use a central repository to make it easier for the users to adopt the new tooling. – user2451418 Aug 12 '13 at 08:29
  • "I would lose the cross-component tags and branches, or not?": or not, if you group dependent repos within one git parent repo, as submodules: http://git-scm.com/docs/git-submodule and http://git-scm.com/book/en/Git-Tools-Submodules – VonC Aug 12 '13 at 08:50
  • OK, so you are saying that if I split my CVS repos, migrate them with cvs2svn into a git-compatible format, import the dump files into separate git repos and combine these as submodules in a central git repo, then I would not lose any of my history? Then all branches and tags would still be available? – user2451418 Aug 12 '13 at 09:58
  • @user2451418 branches will be there for each repo, but not in the parent repo. The idea of submodules is to allow dependent modules to be linked *going forward*. I don't know of a way to reconstitute those dependencies for all past commits. – VonC Aug 12 '13 at 10:24
  • @user2451418 if those modules are tightly coupled and really inter-dependent, then submodules aren't the best approach anyway, and cleaning the CVS repo remains your best option: the goal is to import into a git repo a CVS repo that is as small as possible. – VonC Aug 12 '13 at 10:25
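
The workflow discussed in these comments (one fast-import per split dump, then a parent repo that links the results as submodules) could be planned roughly as below. This is a hedged sketch, not a tested migration script: the module names, dump paths, `base_url`, and the `Step`, `plan_migration`, and `run` helpers are all hypothetical, and each dump path is assumed to be absolute and to hold a complete fast-import stream for that module. As noted above, the submodule links only apply going forward; per-module branches and tags stay in each module's own repo.

```python
import subprocess
from collections import namedtuple

# One command to run: argv list, working directory, optional file fed to stdin.
Step = namedtuple("Step", "argv cwd stdin")


def plan_migration(modules, base_url, parent="parent"):
    """Build the list of git commands; modules maps name -> dump file path."""
    steps = []
    for name, dump in modules.items():
        repo = name + ".git"
        steps.append(Step(["git", "init", "--bare", repo], ".", None))
        # git fast-import reads the stream from standard input.
        steps.append(Step(["git", "fast-import"], repo, dump))
    steps.append(Step(["git", "init", parent], ".", None))
    for name in modules:
        url = base_url + "/" + name + ".git"  # hypothetical hosting location
        steps.append(Step(["git", "submodule", "add", url, name], parent, None))
    steps.append(Step(["git", "commit", "-m", "Add modules as submodules"], parent, None))
    return steps


def run(steps):
    """Execute the planned steps in order, stopping at the first failure."""
    for step in steps:
        if step.stdin is None:
            subprocess.run(step.argv, cwd=step.cwd, check=True)
        else:
            with open(step.stdin, "rb") as stream:
                subprocess.run(step.argv, cwd=step.cwd, stdin=stream, check=True)
```

Separating the plan from its execution makes it possible to print and review the commands before running a multi-hour import; the submodule steps assume the imported repos have already been published at `base_url`.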

I faced the same issue, but it is solved now. Please download the latest cvs2svn, which includes a fix that reduces the size of the dump considerably by reducing the metadata for symbol commits. The fix is in cvs2git version 2.5 or later.

(You can view the change made in https://github.com/mhagger/cvs2svn/commit/fd177d0151f00b028b4a0df21e0c8b7096f4246b)

Arjun