I have created a simple script to migrate a fairly large SVN repository. Instead of using git svn clone, I am using git svn init and git svn fetch so that I can specify a revision range and fetch the history chunk by chunk. More or less, it is something like this:

while [ "$CURRENT_REVISION" -lt "$MAX_REVISION" ]; do
  END_REVISION=$((CURRENT_REVISION + 100))
  if [ "$END_REVISION" -ge "$MAX_REVISION" ] 
  then
    END_REVISION=$MAX_REVISION
  fi

  git svn fetch -r "$CURRENT_REVISION":"$END_REVISION"  --authors-file="$AUTHORS_FILE" 

  # Advance to the end of the chunk just fetched; END_REVISION is recomputed at the top of the loop
  CURRENT_REVISION=$END_REVISION
done
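
The chunking arithmetic can also be isolated into a small helper. The following is only a sketch under stated assumptions: the function name `revision_chunks` and its three arguments (first revision, last revision, chunk size) are hypothetical, and unlike the loop above it produces ranges that do not share a boundary revision.

```shell
# Hypothetical helper: print non-overlapping "start:end" revision ranges.
# $1 = first revision, $2 = last revision, $3 = chunk size (all assumptions).
revision_chunks() {
  start=$1
  max=$2
  size=$3
  while [ "$start" -le "$max" ]; do
    end=$((start + size - 1))
    # Cap the final chunk at the last revision
    if [ "$end" -gt "$max" ]; then
      end=$max
    fi
    echo "$start:$end"
    # The next chunk begins right after the one just printed
    start=$((end + 1))
  done
}
```

For example, `revision_chunks 1 250 100` prints `1:100`, `101:200` and `201:250`, one per line. Each printed range could then be passed to `git svn fetch -r` together with the `--authors-file` option used above.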

However, I understand that by default fetch/clone does not retain empty directories. Thus, I might need to check in those empty directories manually (which I'm trying to avoid).

There is a --preserve-empty-dirs parameter for git svn clone but not for git svn fetch.

Is there a workaround for this?

UPDATE

Even though the official documentation does not mention that the config key can be used for fetch, it actually works.

There is a detailed explanation by @Vampire in an answer to this question, so I'll summarize it here.

After initializing the repository with git svn init, I had to change the configuration of my SVN remote:

git config svn-remote.<remote name>.preserve-empty-dirs "true"

git config svn-remote.<remote name>.placeholder-filename ".gitkeep"

You can verify the configuration by looking at .git/config. Then just do a normal fetch and your empty directories will be preserved.
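
As an alternative to opening .git/config by hand, the two keys can be read back with `git config --get`. This is a minimal sketch: the function name `show_empty_dir_settings` is hypothetical, and the remote name defaults to `svn` on the assumption that no other name was chosen at init time.

```shell
# Hypothetical helper: print the empty-directory settings of one svn-remote.
# $1 = remote name; defaults to "svn" (an assumption, adjust if yours differs).
show_empty_dir_settings() {
  remote=${1:-svn}
  git config --get "svn-remote.$remote.preserve-empty-dirs"
  git config --get "svn-remote.$remote.placeholder-filename"
}
```

Run it inside the repository after setting the keys; a key that is not set prints nothing.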

alfonzjanfrithz
  • Do you use this for a one-time conversion, or do you want to use git-svn for using Git as frontend to the Subversion repository that will still be the canonical source for your code? – Vampire Jul 12 '16 at 12:20
  • Yes, I can say it's a one-time conversion. – alfonzjanfrithz Jul 12 '16 at 15:40

1 Answer

git-svn is not the right tool for one-time conversions of repositories. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for one-time conversions you should not use git-svn, but svn2git, which is much better suited to this use case.

There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best one I know of, and it is very flexible in what you can do with its rules files.

If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool to investigate the history of an SVN repository when migrating it to Git.


If you still want to use git svn, you don't need to chunk your fetching manually. Just do git svn clone ...; if you want to pause, cancel the execution and later do git svn fetch to continue the fetching process. It will automatically continue where it stopped and even validate whether the last fetched revision was fetched completely or needs to be refetched.


If you still want to chunk your fetching manually, be aware that git svn clone ... is exactly the same as git svn init ... followed by git svn fetch, with one exception: depending on the clone-specific parameters, between the init and the fetch the config property svn-remote.<remote name>.preserve-empty-dirs is set to true and svn-remote.<remote name>.placeholder-filename is set to the argument you gave. So just do that manually after your init and before your fetch and you are fine.
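
The init, configure, fetch sequence described above could be wrapped up roughly as follows. This is a sketch, not a definitive recipe: the function name `init_config_fetch`, its arguments, and the `.gitkeep` default are hypothetical, and the remote name `svn` is assumed because no other remote name is passed to init.

```shell
# Hypothetical wrapper: init, enable empty-directory preservation, then fetch.
# $1 = SVN URL, $2 = placeholder file name (defaults to .gitkeep).
# Assumes the svn-remote is named "svn"; adjust the keys if yours differs.
init_config_fetch() {
  url=$1
  placeholder=${2:-.gitkeep}
  git svn init "$url" &&
  git config svn-remote.svn.preserve-empty-dirs true &&
  git config svn-remote.svn.placeholder-filename "$placeholder" &&
  git svn fetch
}
```

The `&&` chain stops at the first failing step, so the fetch only runs once both config keys are in place. The final `git svn fetch` can be replaced by a chunked loop with `-r` ranges if desired.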

One note though: as far as I remember, preserve-empty-dirs is only necessary if you need to have those empty directories in the Git repository and you will use only the Git repository afterwards. As long as you only use git svn as a frontend to an existing SVN repository, the empty dirs should be created automatically in your workspace and need not be part of the actual repository history. What preserve-empty-dirs does is add an empty .gitignore file (or whatever you configured with the other parameter / config property) to the commit, so that the folder is essentially not empty anymore. This is done because Git is a content tracker, not a filesystem tracker. It tracks your source code that happens to be stored in files, not the files and folders themselves. That is also why there is no such thing as an explicit move or copy operation in Git: moves and copies are determined on the fly where necessary and wanted. Technically, a move is just a remove and an add, and a copy is a simple add.
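
The placeholder idea itself can be illustrated by hand on an ordinary working tree. The sketch below is not how git svn implements it internally; the function name `add_placeholders` and the `.gitkeep` default are hypothetical, and it relies on the `-empty` test, which GNU and BSD find support but POSIX does not specify.

```shell
# Hypothetical helper: put an empty placeholder file into every empty
# directory below a root, skipping Git's own .git directory.
# $1 = root directory, $2 = placeholder name (defaults to .gitkeep).
add_placeholders() {
  root=$1
  name=${2:-.gitkeep}
  # -prune keeps find out of .git; sh -c builds the path portably
  find "$root" -name .git -prune -o -type d -empty \
    -exec sh -c 'touch "$1/$2"' _ {} "$name" \;
}
```

After running it, each previously empty directory contains one file, so Git has content to track there once the files are added and committed.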

Vampire
  • Thanks! I will give an update on this solution. I will basically add `svn-remote.<remote name>.preserve-empty-dirs` to my config and re-fetch the repo. Too bad that the official documentation does not explicitly say that `preserve-empty-dirs` can be set as a config key. Thank you for the enlightenment. I might also try another solution: check out all of the active SVN branches, run a command to touch a dummy file in all of the empty dirs, copy those dirs back to the Git workspace, and create a new Git commit for that. – alfonzjanfrithz Jul 13 '16 at 03:37
  • I get this error: `svn-all-fast-export: /build/subversion-8E3yhQ/subversion-1.9.3/subversion/libsvn_subr/dirent_uri.c:972: svn_dirent_join: Assertion \`svn_dirent_is_canonical(base, pool)' failed.` Back to `git svn` :-( – peterh Nov 27 '17 at 13:49
  • @peterh you have to have the repo locally. See here: https://github.com/svn-all-fast-export/svn2git/issues/3 – Vampire Nov 27 '17 at 18:03
  • If `git-svn` is not the right tool for one-time conversions of repositories, then why did it get more than 1400 upvotes in this question: https://stackoverflow.com/questions/79165/how-do-i-migrate-an-svn-repository-with-history-to-a-new-git-repository – josch Feb 06 '18 at 17:47
  • Because it is the easiest to use, it ships with Git, and the superstition that it is suited for one-time migration is very common. Yet that is just an abuse of it and not what it was developed for. Read for example my answer at https://stackoverflow.com/questions/48444548/exclude-deleted-svn-branches-tags-on-migration-to-git-git-svn for some of the points that make `git-svn` unsuited for one-time migrations. – Vampire Feb 07 '18 at 08:33