
My small team and I work in Git, while the larger group uses Subversion. I'd like to schedule a cron job that publishes our repositories' current HEADs every hour into a certain directory in the SVN repo.

I thought I had this figured out, but the recipe I wrote down previously doesn't seem to be working now:

git clone ssh://me@gitserver/git-repo/Projects/ProjX px2
cd px2
svn mkdir --parents http://me@svnserver/svn/repo/play/me/fromgit/ProjX
git svn init -s http://me@svnserver/svn/repo/play/me/fromgit/ProjX
git svn fetch
git rebase trunk master
git svn dcommit

Here's what happens when I try it:

% git clone ssh://me@gitserver/git-repo/Projects/ProjX px2
Cloning into 'ProjX'...
...

% cd px2

% svn mkdir --parents http://me@svnserver/svn/repo/play/me/fromgit/ProjX
Committed revision 123.

% git svn init -s http://me@svnserver/svn/repo/play/me/fromgit/ProjX
Using higher level of URL: http://me@svnserver/svn/repo/play/me/fromgit/ProjX => http://me@svnserver/svn/repo

% git svn fetch
W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no item: File not found: revision 100, path '/play/me/fromgit/ProjX'
W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
This may take a while on large repositories

% git rebase trunk master
fatal: Needed a single revision
invalid upstream trunk

I could have sworn this worked previously. Does anyone have any suggestions? Thanks.

Ken Williams

2 Answers


There are a few issues with your approach:

  • You seem to be using a pre-existing git repo, rather than one which was initialised via git svn init. Rebasing assumes a common ancestor, but if your git repo was previously initialised via git init, then git svn init will create a second root (i.e. parent-less) commit, and rebasing from one tip to the other will not work without --onto.
  • You use the -s option to git svn init, which causes it to search for branches/, tags/, and trunk/. As the warning (Using higher level...) clearly states, this results in the git-svn config pointing at the top of the svn repo, not the fromgit/ProjX subdirectory.
  • You refer to trunk even though there's no good reason for this branch to exist; git svn init actually creates a tracking branch called remotes/git-svn.
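
The first point can be checked in a throwaway sandbox. The sketch below needs no SVN server: an orphan branch stands in for the parent-less commit that the SVN import would create, and the sandbox directory, the branch name `imported`, and the file names are all made up for illustration.

```shell
#!/bin/sh
# Sandbox sketch: two unrelated root commits in one repository.
# All names (sandbox directory, branch "imported", files) are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "git side" > a.txt
git add a.txt
git commit -q -m "root commit created in git"
git_tip=$(git rev-parse HEAD)

# An orphan branch stands in for the parent-less commit that
# the SVN import would add to an existing git repository.
git checkout -q --orphan imported
git rm -q -rf .
echo "svn side" > b.txt
git add b.txt
git commit -q -m "root commit standing in for the SVN import"
svn_tip=$(git rev-parse HEAD)

# git merge-base exits non-zero when there is no common ancestor.
if git merge-base "$git_tip" "$svn_tip" >/dev/null; then
    shared_history=yes
else
    shared_history=no
fi
echo "shared history: $shared_history"

# Count the parent-less commits reachable from any ref.
root_count=$(git rev-list --max-parents=0 --all | wc -l | tr -d ' ')
echo "root commits: $root_count"
```

Running the same two checks in a real clone after `git svn fetch` should show whether the local history and the git-svn history are related at all.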

So the actual sequence you want is:

# 1st time only
svn mkdir --parents http://me@svnserver/svn/repo/play/me/fromgit/ProjX
mkdir px2
cd px2
git svn init http://me@svnserver/svn/repo/play/me/fromgit/ProjX
git svn fetch

Now hacking can occur concurrently in git and svn. Next time you want to dcommit from git to svn, you simply do:

cd px2
git svn rebase
git svn dcommit
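
Since the question asks for an hourly cron job, here is a minimal sketch of a wrapper around those two commands. The function name `publish_to_svn`, the lock directory, and the paths in the crontab comment are assumptions made for illustration; the sketch does not cover how new Git commits reach the working copy in the first place.

```shell
#!/bin/sh
# publish_to_svn DIR: run the rebase/dcommit pair inside DIR.
# The function name, lock directory, and paths are hypothetical.
publish_to_svn() {
    repo_dir=$1
    lock_dir="$repo_dir/.publish.lock"

    # mkdir is atomic, so it doubles as a simple lock against
    # an earlier run that is still in progress.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "publish already running in $repo_dir" >&2
        return 1
    fi

    status=0
    (
        cd "$repo_dir" &&
        git svn rebase &&
        git svn dcommit
    ) || status=$?

    # Release the lock whether or not the publish succeeded.
    rmdir "$lock_dir"
    return "$status"
}

# Example crontab entry (hypothetical paths), once per hour:
#   0 * * * * . /home/me/bin/publish.sh && publish_to_svn /home/me/px2
```

The lock is there because a slow `git svn dcommit` could otherwise overlap with the next hourly run. The function returns the exit status of the failed step, so cron's mail-on-error behaviour can be used to notice problems.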

If you have already initialised the git repository and started hacking in it, the first-time-only sequence is more difficult, because you need to transplant all of the git history onto the git-svn branch even though the two don't share a common ancestor:

# 1st time only
svn mkdir --parents http://me@svnserver/svn/repo/play/me/fromgit/ProjX
git clone ssh://me@gitserver/git-repo/Projects/ProjX px2
cd px2
git svn init http://me@svnserver/svn/repo/play/me/fromgit/ProjX
git svn fetch

# transplant original git branch onto git-svn branch
root_commit=$( git rev-list --reverse HEAD | head -n1 )
git tag original-git
git reset --hard $root_commit
git reset --soft git-svn
git commit -C $root_commit
# N.B. this bit requires git >= 1.7.2
git cherry-pick $root_commit..original-git
# For older gits you could do
#   git rev-list --reverse $root_commit..original-git | xargs -n1 git cherry-pick
# or use git rebase --onto but that requires jumping through some
# hoops to stop moving remotes/git-svn.

Subsequently, do the same svn rebase and dcommit as before.
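
To see what the transplant does without touching a real SVN server, the following sandbox sketch fakes `refs/remotes/git-svn` with an empty parent-less commit (a stand-in for the revision created by `svn mkdir`) and then runs the same steps. The sandbox directory, file name, and commit messages are made up, and the fake ref is only an approximation of what `git svn fetch` produces.

```shell
#!/bin/sh
# Sandbox sketch of the transplant; no SVN server is involved.
# refs/remotes/git-svn is faked with an empty commit standing in for
# the revision that "svn mkdir" created. All names are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits of pre-existing git history.
for n in 1 2 3; do
    echo "change $n" >> file.txt
    git add file.txt
    git commit -q -m "git commit $n"
done

# Fake tracking ref: a parent-less commit with an empty tree.
empty_tree=$(git mktree </dev/null)
svn_base=$(git commit-tree -m "stand-in for svn mkdir" "$empty_tree")
git update-ref refs/remotes/git-svn "$svn_base"

# The transplant steps from the answer.
root_commit=$(git rev-list --reverse HEAD | head -n1)
git tag original-git
git reset -q --hard "$root_commit"
git reset -q --soft git-svn
git commit -q -C "$root_commit"
git cherry-pick "$root_commit..original-git" >/dev/null

# The rewritten branch should sit on top of git-svn with the same files.
new_commits=$(git rev-list --count git-svn..HEAD)
if git diff --quiet original-git HEAD; then
    same_tree=yes
else
    same_tree=no
fi
echo "commits on top of git-svn: $new_commits, same content: $same_tree"
```

With a linear history like this one, all three commits should be replayed on top of the fake git-svn commit with identical content. Merge commits in the original history are a separate matter.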

In case anyone wants to test this approach in a sandbox, you can download my test script. I'd recommend you do a visual security audit before running though ;-)

Adam Spiers
  • Thanks much. For your `git svn init`, I notice you don't use the `-s` switch. Is that because we won't be using the SVN-style branch structure? Personally I'd be happy to not use SVN branches, but I may have Git branches & release tags that I'd like to expose somehow. If that's not pretty, it's probably not a deal-breaker though. – Ken Williams Oct 24 '12 at 15:28
  • The second bullet point near the top of my answer explains why I dropped the `-s` switch. Presumably you want to avoid an svn -> git import of the whole svn repository, by narrowing the import to the `fromgit/ProjX` subdirectory. So `-s` cannot be used. Like you say, this excludes use of SVN branches, but not of git branches/tags. If you find my answers helpful, please upvote and accept. Thanks ;) – Adam Spiers Oct 24 '12 at 16:04
  • Oops! Sorry I missed that bullet on my first reading. I've tried out about half your commands so far and they certainly look good enough for an upvote. I'll be back to accept when I've been able to try out the whole litany. – Ken Williams Oct 24 '12 at 19:12
  • Okay, I've tried out all the "more difficult" initialization sequence, and the only thing that's still tripping me up is that the `git cherry-pick` doesn't want to handle the merges. Is there a clean way to automatically handle those, or will I need to go into the directory and manually deal with each merge? – Ken Williams Oct 24 '12 at 20:08
  • There's no way to automatically resolve merge conflicts - git isn't telepathic ;-) So yes you'll have to resolve each one yourself. You could probably smooth this process out with something like `git rev-list --reverse $root_commit..original-git | while read sha; do git cherry-pick $sha || git mergetool; done`, but that's totally off the top of my head and untested :) – Adam Spiers Oct 24 '12 at 21:19
  • I guess what I don't understand is what's causing the conflict. In my git master, I've got just a simple linear history of changes, some of which were merges (most commonly when someone commits locally, then pulls, then pushes without first doing a rebase), and I thought I already resolved those merges once, when they were first committed. Why is it necessary to re-resolve them? – Ken Williams Oct 25 '12 at 19:21
  • (on that last question, just a pointer to a reference would be great!) – Ken Williams Oct 25 '12 at 19:25
  • http://stackoverflow.com/questions/190431/is-git-svn-dcommit-after-merging-in-git-dangerous ... and please upvote any comments you find helpful, thanks. (Yes, I'm a karma whore ;-) – Adam Spiers Oct 25 '12 at 23:18
  • In fact, just [search SO](http://stackoverflow.com/search?q=git+svn+rebase+merge+conflicts) and you'll find a ton of useful info on this topic. – Adam Spiers Oct 25 '12 at 23:25
  • Status: I've perl-scripted the "more difficult" steps, and I deal with cherry-picking merges by offering to do `git merge -Xtheirs $sha` whenever the cherry-pick fails. That gets me to a `master` that diverges from `origin/master` by 83 and 2 commits, respectively. Sure enough, when I look at the first commit on `master` and `origin/master`, they have different SHAs. Maybe I'd be better merging all these commits instead of cherry-picking? That's what I'm currently experimenting with. – Ken Williams Oct 26 '12 at 19:00
  • Of course, when I do the `git svn rebase` step, git keeps dying with `fork: retry: Resource temporarily unavailable`. Maybe Cygwin is out of file descriptors or process IDs or something. Gonna reboot here... =/ – Ken Williams Oct 26 '12 at 19:03
  • Well, that was a bad idea. Merging makes me lose the commit messages. – Ken Williams Oct 26 '12 at 20:20
  • You might do better by looking carefully at *why* the cherry-picks are failing, i.e. the actual conflicts generated. Are they all commits you really want to cherry-pick? If so, why is there a conflict? Is one of the parents faulty? Is it an issue with the previous history, or with what you're currently trying to accomplish? Or if you don't want to cherry-pick that commit, how can you automatically avoid it? Root cause analysis will be more productive than blind experimentation... hope that helps. If not, it might merit another SO question. – Adam Spiers Oct 26 '12 at 22:28
  • All of these failing cherry-picks are merge-records with no actual content changes. So there's no actual conflict to investigate as far as I can tell. And in this case, I think it doesn't really matter which parent I choose for the cherry-pick, does it? But you're right, I've put way more in this SO thread than I probably should. – Ken Williams Oct 27 '12 at 19:29
  • So don't cherry-pick those ones :) – Adam Spiers Oct 27 '12 at 22:33
  • Since I've got almost all of this working, but one discrete part that's not, I opened a new question here: http://stackoverflow.com/questions/13147373/git-find-new-cherries – Ken Williams Oct 30 '12 at 20:50
  • @AdamSpiers - What does the 'git-svn' refer to in the command: git reset --soft git-svn; I keep getting "fatal: ambiguous argument 'git-svn': unknown revision or path not in the working tree." – Karra Feb 27 '13 at 02:52
  • It refers to the `git-svn` remote-tracking branch (`refs/remotes/git-svn`) that `git svn fetch` creates. If you download and run the test script, you'll see it there. – Adam Spiers Feb 27 '13 at 03:35

I¹ would like to split the problem into two parts:

  1. Importing Git history into existing Subversion repository;
  2. Automatic synchronization of Git and SVN repositories afterwards.

My proposal is based on SubGit².

Importing Git repository into SVN

  1. If you have local access to the Subversion repository at svnserver, the setup is fairly straightforward and well documented in the SubGit Book:

    Let's assume that the Subversion repository is located at $SVN_REPO and the Git repository is located at $GIT_REPO. First, run the following command:

    $ subgit configure $SVN_REPO
    

    Then adjust the generated $SVN_REPO/conf/subgit.conf file as follows:

     [git "ProjX"]
         repository = $GIT_REPO
         translationRoot = /play/me/fromgit/ProjX
         trunk = trunk:refs/heads/master
         branches = branches/*:refs/heads/*
         shelves = shelves/*:refs/shelves/*
         tags = tags/*:refs/tags/*
    

    I'm not quite sure about the right translationRoot value for your case. It must be the project path relative to the SVN repository root, so please specify the proper one.

    Don't forget to remove the other 'git' sections, so that SubGit doesn't translate those parts of the SVN repository into Git.

    Optionally, you can adjust the $SVN_REPO/conf/authors.txt file to specify how Git committer names should be translated into SVN author names:

    SvnAuthor = Git Committer <git.committer@company.com>
    

    Finally, import your Git repository into SVN:

    $ subgit install $SVN_REPO
    

    At this point $SVN_REPO and $GIT_REPO have special hooks installed by SubGit. The Subversion and Git servers trigger these hooks on every incoming modification, which is how SubGit automatically synchronizes the SVN and Git repositories. The resulting mirror is bi-directional, i.e. some developers may use SVN clients while others use any Git client.

    If you don't need this kind of synchronization, just disable it:

    $ subgit uninstall [--purge] $SVN_REPO
    
  2. If you don't have local access to the Subversion repository, things get much trickier, but it is still possible:

    First, fetch the whole SVN repository to your machine, so you can access it locally:

    $ svnadmin create repo
    $ svnrdump dump http://me@svnserver/svn/repo | svnadmin load repo
    

    Now note the latest revision of the fetched repository:

    $ svn info file:///path/to/repo
    Path: repo
    URL: file:///path/to/repo
    Repository Root: file:///path/to/repo
    Repository UUID: cbc56e97-717f-4d50-b705-cb6de2c836eb
    Revision: $LATEST_REVISION
    Node Kind: directory
    Last Changed Author: SvnAuthor
    Last Changed Rev: $LATEST_REVISION
    Last Changed Date: 2012-10-27 14:01:38 +0200 (Sat, 27 Oct 2012)
    

    Then repeat all the steps from the previous item, so that the local SVN repository ends up storing the imported Git history.

    Finally, send the generated history back to SVN:

    $ svnsync initialize http://me@svnserver/svn/repo file:///path/to/repo
    $ svn propset --revprop -r0 svn:sync-last-merged-rev $LATEST_REVISION http://me@svnserver/svn/repo
    $ svnsync synchronize http://me@svnserver/svn/repo
    

    Note that the remote repository must have a pre-revprop-change hook enabled for svnsync to work.
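
For reference, a pre-revprop-change hook is an executable in the repository's hooks directory that Subversion runs with the arguments REPOS, REV, USER, PROPNAME and ACTION; it allows the change by exiting with status 0. The sketch below writes such a hook for a single permitted user. The function name `install_sync_hook` and the choice to restrict by user name are assumptions for illustration, and the hook would have to be installed by whoever administers svnserver.

```shell
#!/bin/sh
# install_sync_hook REPO_DIR SYNC_USER
# Writes a pre-revprop-change hook that only lets SYNC_USER change
# revision properties. Function name and policy are hypothetical.
install_sync_hook() {
    repo_dir=$1
    sync_user=$2
    hook="$repo_dir/hooks/pre-revprop-change"

    mkdir -p "$repo_dir/hooks"
    cat > "$hook" <<EOF
#!/bin/sh
# Arguments passed by Subversion: REPOS REV USER PROPNAME ACTION
if [ "\$3" = "$sync_user" ]; then
    exit 0
fi
echo "Only $sync_user may change revision properties" >&2
exit 1
EOF
    chmod +x "$hook"
}

# Example (hypothetical server-side path):
#   install_sync_hook /var/svn/repo me
```

A hook that simply exits 0 for everyone would also satisfy svnsync, but it lets any committer rewrite log messages and other revision properties, which is why this sketch restricts it to one account.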

Automatic synchronization of Git and SVN repositories

  1. If you have local access to the Subversion repository, you can keep SubGit running after you've installed it. As I already mentioned, it translates changes immediately after they are sent to either repository, so there's no need to maintain a cron job in this case.

  2. If you have no local access to the Subversion repository but would still like to use SubGit, I'd recommend using svnsync to synchronize the SVN repository hosted at svnserver with the SVN repository controlled by SubGit³.

    You can find a guide on how to set up this kind of mirror in this HOW-TO.

    Please note that in this case your project should have its own separate Subversion repository; otherwise it's very hard to keep svnsync working properly.

  3. If for any reason you decide to use git-svn, I'd recommend creating a fresh clone of the Subversion repository after you've imported the Git history with SubGit:

    $ git svn init -s http://me@svnserver/svn/repo/play/me/fromgit/ProjX
    $ git svn fetch
    

    In my experience, importing Git history into SVN is not what git-svn was made for. SubGit, on the other hand, handles that well.

¹ Full disclosure: I'm one of the SubGit developers.

² SubGit is a commercial product, but it is free for small teams with up to 10 committers. I think that applies to your case.

³ Our team is working on SubGit 2.0, which supports synchronization of SVN and Git repositories located on different hosts. Basically, you can create a Git repository anywhere and install SubGit into it with an SVN URL specified. After that you can work with either repository, and changes get translated automatically between them.

We're going to publish an EAP build with that functionality, so you may be able to try it soon.

vadishev
  • Thanks @radio, very helpful to see your explanation. I had considered using SubGit, but didn't see the details previously about being free for up to 10 users. In my case, I don't have local access to the SVN repo, and the repo is being used for other things besides my Git export, so it looks like I might be out of my league using this strategy. Also, we won't have other people committing to SVN in this part of the repo, so we don't need to worry about true 2-way synchronization. – Ken Williams Oct 29 '12 at 21:13
  • @KenWilliams Thanks for the detailed comment. Lack of local access to SVN repository is a very common reason why people can't use SubGit. Hopefully we will fix that with an upcoming release. – vadishev Oct 29 '12 at 21:41