I'd like to migrate our SVN repository to git.

Our current repository is a huge singleton pile comprising a number of Visual Studio solutions, all residing in separate sub directories of the repository.

When transforming it to git I'd like to split the SVN repository into separate git repositories for each solution while maintaining each solution's history at the same time.

I don't want the history of the whole SVN repository in all of our future git repositories. All I want in these future git repositories is the history of a particular sub directory.

Is this possible?


Current SVN repository file structure:

svn_base
   |-- Solution1
   |   |-- 1.cs
   |   |-- 1.csproj
   |   |-- 1.sln
   |-- Solution2
   |   |-- 2.cs
   |   |-- 2.csproj
   |   |-- 2.sln
   |-- Solution3
   |   |-- 3.cs
   |   |-- 3.csproj
   |   |-- 3.sln

Desired git repository file structure:

Solution1
   |-- .git
   |-- 1.cs
   |-- 1.csproj
   |-- 1.sln

Solution2
   |-- .git
   |-- 2.cs
   |-- 2.csproj
   |-- 2.sln


Solution3
   |-- .git
   |-- 3.cs
   |-- 3.csproj
   |-- 3.sln
AxD
  • With git-svn, you tell it where the trunk is (also directory where you hold branches/tags) so it should be no problem to separate each project into their own separate git repo. – eftshift0 Jul 28 '21 at 18:14
  • You could also first [convert SVN to Git](https://stackoverflow.com/questions/79165/how-do-i-migrate-an-svn-repository-with-history-to-a-new-git-repository), then [split your history in different repositories](https://stackoverflow.com/questions/359424/detach-move-subdirectory-into-separate-git-repository). – Elias Holzmann Jul 28 '21 at 18:15
  • I'm not sure how good the tool is for converting SVN to Git and splitting there; maybe it's great. But I know the new git-filter-repo tool is great, so if it were me, I'd go with @EliasHolzmann's suggestion and convert SVN to one big Git repo first, then use git-filter-repo for the rest. – TTT Jul 28 '21 at 19:51
  • As a side note, and without knowing anything about your codebase, my gut feeling is one solution per repo may be a little too granular for a repo. You didn't say why you want to split, but unless each one is excessively large, or perhaps you have classifications restricting certain people from seeing certain code, I'd probably leave it in a single repo until I had a good reason to split it out. (Maybe you do.) – TTT Jul 28 '21 at 19:55
  • @EliasHolzmann: Excellent hint! I tried on a sample repository and `git subtree` worked like a charm. (Unfortunately, the man page for this command seems to be missing.) If you want to post your comment as an answer, I'd gladly upvote it. – AxD Jul 29 '21 at 00:43
  • @AxD Thanks, I fleshed out my comment a bit (and added why `git subtree` is indeed **not** optimal here). Regarding the missing man page: My distribution (Arch Linux) has this man page, and the web knows it, too: https://manpages.debian.org/testing/git-man/git-subtree.1.en.html This seems to be a problem of your distribution. – Elias Holzmann Jul 29 '21 at 10:28
  • @EliasHolzmann: I'm using Windows, so I've been using Google to search for `git subtree`. Usually there's online documentation for each of the commands (e.g. https://git-scm.com/docs/git-commit ), but there is no documentation for https://git-scm.com/docs/git-subtree. Perhaps someone may notify the git team about this missing online documentation. – AxD Jul 29 '21 at 16:31

2 Answers

5

While the answer @acran gave does solve the problem, it is also possible and sometimes advantageous to first convert the SVN repository to Git and to then split the big monorepo into multiple smaller repositories.

1. Converting SVN to Git

If your SVN repository has a standard layout (subdirectories branches, tags and trunk) and you don't need any other bells and whistles, this is quite easy:

$ git svn clone --stdlayout <url_to_subversion_repo>

This command has two gotchas:

  1. git svn uses the SVN login as the Git author name and generates a placeholder mail address from it (of the form <svn_user>@<repository UUID>, as far as I know). If this is not what you want, you can use an authors file. Add a file user_mapping.txt mapping SVN users to Git users:
    svn_user_1 = Git User 1 <user1@example.com>
    svn_user_2 = Git User 2 <user2@example.com>
    
    And then call git svn clone with this file:
    $ git svn clone --stdlayout --authors-file=user_mapping.txt <url_to_subversion_repo>
    
  2. As SVN tags can change, git svn imports them as Git branches. If you want, you can convert them to real Git tags after the clone.
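
If you need a starting point for user_mapping.txt, the SVN log already lists every committer. The following is a minimal sketch, assuming the svn command-line client is available. The function name svn_log_to_authors and the example.com addresses are placeholders (not part of git svn), so the generated file still needs editing by hand:

```shell
# Turn the output of `svn log -q` (read from stdin) into authors-file lines.
# Assumption: revision lines look like "r123 | user | date".
# The e-mail domain is a placeholder; replace it with real addresses.
svn_log_to_authors() {
    awk -F '|' '/^r[0-9]+ / {
        user = $2
        gsub(/^ +| +$/, "", user)   # trim the spaces around the user name
        if (user != "") print user " = " user " <" user "@example.com>"
    }' | sort -u
}

# Usage (needs access to the SVN repository):
#   svn log -q <url_to_subversion_repo> | svn_log_to_authors > user_mapping.txt
```

Each SVN user appears once in the result, in the `svn_user = Name <mail>` form that --authors-file expects.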

git svn clone fetches every revision of your SVN repo in order from the SVN server, so a big repository will take a while (in my experience multiple hours for ~50,000 revisions, though this was years ago and I am not sure about the exact numbers). If possible, you may want to run this command on the SVN server itself, especially if you have a slow connection to it. Either way, go grab a cup of coffee (or five).
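
Once the clone has finished, the tag branches from the second gotcha can be turned into ordinary Git tags. This is a minimal sketch, assuming git svn stored them under refs/remotes/origin/tags. Check with `git branch -r` first, since the prefix depends on your Git version and clone options. convert_svn_tags is a hypothetical helper name:

```shell
# Create a lightweight Git tag for every ref below a tag prefix.
# Assumption: SVN tags were imported as refs/remotes/origin/tags/<name>;
# pass a different prefix as $1 if your clone uses another layout.
convert_svn_tags() {
    prefix=${1:-refs/remotes/origin/tags}
    git for-each-ref --format='%(refname)' "$prefix" |
    while read -r ref; do
        tag=${ref#"$prefix"/}      # strip the prefix to get the tag name
        git tag "$tag" "$ref"
    done
}

# Usage, from inside the converted repository:
#   convert_svn_tags
```

The sketch leaves the original remote-tracking refs in place, so nothing is lost if a tag name turns out wrong.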

2. Splitting the Git repository

There are multiple tools to split up Git repositories into sub repositories. See for example this question. When I did this a few years ago, I used git filter-branch, but this tool is now deprecated. You may still use it, or you may use git filter-repo, though I don't have any experience with that tool.

The most upvoted answer to the question I linked uses git subtree split. I suggest not following that answer, as git subtree split only converts one branch, in effect removing all other branches from your sub repositories.
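
For the git filter-repo route, a rough sketch could look like the following. It assumes git-filter-repo is installed separately (it does not ship with Git) and that the converted monorepo is available at a local path. split_solution and all paths are hypothetical names. The sketch uses only the --subdirectory-filter option, which keeps the history of one directory and makes that directory the new repository root:

```shell
# Split one subdirectory of a converted monorepo into its own repository.
# Assumptions: git-filter-repo is installed; all arguments are placeholders.
#   $1 = path to the converted monorepo
#   $2 = subdirectory to keep (e.g. Solution1)
#   $3 = directory for the new repository
split_solution() {
    monorepo=$1 subdir=$2 target=$3
    # filter-repo expects a fresh clone, so the monorepo itself stays untouched.
    git clone --no-local "$monorepo" "$target" || return 1
    # Keep only the history touching $subdir and make it the repository root.
    git -C "$target" filter-repo --subdirectory-filter "$subdir"
}

# Usage:
#   split_solution ./svn_base_git Solution1 ./Solution1
```

Working on a throwaway clone per solution means a failed or unwanted split can simply be deleted and repeated without re-running the SVN conversion.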

Advantages

What are the advantages of this answer over converting every sub repository via git svn clone?

  • You only need to clone the SVN repository once. This is probably faster than cloning the subfolders for each project (though I have not tested this, it is only an educated guess).
  • Cloning an SVN repository with the standard layout is better tested than cloning an SVN repository with a non-standard layout. In my experience, git svn does not always do what you want it to, so a more standard usage is probably more likely to result in what you want.
  • If you want to rewrite the history of your new Git repositories (for example, to remove big binary files), you can rewrite the history of the monorepo between step one and step two. It would be a bigger effort to do this for every new sub repository.
Elias Holzmann
2

If your projects are neatly separated into their own subdirectories, this should be quite straightforward using the --trunk parameter to git svn init/git svn clone:

git svn clone --trunk=Solution1 $SVN_URI ./Solution1

This will clone only the history of the subfolder Solution1 into a new git repository in the directory ./Solution1. It will only include commits that touch files in that subfolder, and it will adjust the relative paths so that the subfolder is the root directory of the new git repository.
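
With several solutions, the same command can be repeated in a loop. The following is a sketch only: clone_solutions, the RUN switch, and the example URL are made-up names for illustration. By default it prints the commands so they can be reviewed before starting a long-running import:

```shell
# Print (or, with RUN=1, execute) one `git svn clone` per solution directory.
#   $1      = SVN repository URL (placeholder)
#   $2 ...  = names of the solution subdirectories
clone_solutions() {
    svn_uri=$1
    shift
    for solution in "$@"; do
        if [ "${RUN:-0}" = 1 ]; then
            git svn clone --trunk="$solution" "$svn_uri" "./$solution"
        else
            printf '%s\n' "git svn clone --trunk=$solution $svn_uri ./$solution"
        fi
    done
}

# Usage (dry run, then the real thing):
#   clone_solutions https://svn.example.com/svn_base Solution1 Solution2 Solution3
#   RUN=1 clone_solutions https://svn.example.com/svn_base Solution1 Solution2 Solution3
```

Each invocation walks the SVN history again, so the total run time grows with the number of solutions.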

acran
  • Very good answer! I'm currently running your suggestion. Going to wait for the outcome before I vote. I should note that - on Windows - the trunk name should be given with forward slashes, not back slashes, i.e. `--trunk MySubDir/Solution1`. – AxD Jul 29 '21 at 00:45