Background

Our team is working with a codebase stored in CVS, which is managed by the client.
There is no way to convince the client to switch to Git in the foreseeable future.

Therefore, to be able to use the obvious advantages of git, we want to make a git copy of the CVS repo, use it and keep both repos synchronized.

First attempt

Our first attempt to resolve the problem was to:

  • checkout CVS repo
  • git init and git commit in the same directory to import entire tree at once
  • for each new feature, we make a git branch, do the work, run git cvsexportcommit, and use cvs to commit the feature
  • every half an hour we run (from cron) cvs update, git add . and git commit to transfer new changes that appeared in cvs to git

This approach has drawbacks; the main one is that the histories in CVS and git do not match.

Planned changes

So we plan to switch to git cvsimport in a manner described here, more or less.

Issue

Still, we can imagine the following scenario:

  • we have commit Ac on cvs and Ag (made by git cvsimport) on master branch on git, which are alike
  • we make a feature branch, on which we make commit Xg
  • we use git cvsexportcommit and cvs commit to make commit Xc on cvs
  • cvs expands the $Id$ keywords in the committed file, which changes the actual content of the file
  • we run git cvsimport to import changes from cvs. This would transfer Xc to git's master branch and make a commit Xg'

Question

How do we tell git that commits Xg and Xg' are actually the same thing?
According to this post, there seems to be no way to do that, as content is a crucial part of commit's id, and git only uses ids to identify the commits.
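
To make the point about content and ids concrete, here is a minimal sketch of how git derives a blob id from file content, assuming the default SHA-1 object format. The file contents and the helper name `git_blob_id` are made up for illustration. Once CVS expands `$Id$`, the blob id changes, so the tree and the commit id of Xg' cannot match those of Xg.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git assigns to a blob (SHA-1 object format assumed)."""
    # git hashes a small header ("blob <size>\0") followed by the raw content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical file as committed from git (Xg) ...
committed = b"/* $Id$ */\nint main(void) { return 0; }\n"
# ... and the same file after CVS expanded the keyword (Xc, imported as Xg').
expanded = (
    b"/* $Id: main.c,v 1.2 2015/12/10 09:00:00 alice Exp $ */\n"
    b"int main(void) { return 0; }\n"
)

print(git_blob_id(committed))
print(git_blob_id(expanded))
print(git_blob_id(committed) == git_blob_id(expanded))
```

The last line prints `False`: the two files differ only in the expanded keyword, yet they are different objects as far as git is concerned.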

Workarounds

To mitigate the issue, I thought of the following solutions:

  • Instead of issuing git cvsimport, we could use cvsps to make patches and skip the commits created by our team, which we would recognize by the author's email. This would not create commit Xg', so we'd have to take extra care when merging the affected branches.
  • We would never merge feature branches into master, so we'd never have conflicts. This seems easy, but I guess it would make feature branches a bit less useful, and we'd still have no clear git history.
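
The first workaround could look roughly like the sketch below. It assumes cvsps prints each patch set with a `PatchSet N` line followed by an `Author:` line; the sample text, the author names, and the helper names are hypothetical, and a real script would also need the file list and log message of each patch set.

```python
import re
from dataclasses import dataclass


@dataclass
class PatchSet:
    number: int
    author: str


def parse_patch_sets(text: str) -> list:
    """Extract (number, author) pairs from cvsps-style output (format assumed)."""
    patch_sets = []
    current = None
    for line in text.splitlines():
        match = re.match(r"PatchSet (\d+)\s*$", line)
        if match:
            current = PatchSet(int(match.group(1)), "")
            patch_sets.append(current)
        elif current is not None and not current.author and line.startswith("Author:"):
            # Only the first Author: line counts; later ones may be log text.
            current.author = line.split(":", 1)[1].strip()
    return patch_sets


def patch_sets_to_import(patch_sets, team_authors):
    """Keep only patch sets that did not originate from our own git exports."""
    return [p for p in patch_sets if p.author not in team_authors]


# Hypothetical cvsps output with one team commit and one client commit.
SAMPLE = """---------------------
PatchSet 1 
Date: 2015/12/10 09:00:00
Author: alice
Branch: HEAD
Log:
Export of feature X from git

---------------------
PatchSet 2 
Date: 2015/12/10 10:00:00
Author: client_dev
Branch: HEAD
Log:
Fix typo in README
"""

TEAM_AUTHORS = {"alice"}  # hypothetical CVS user names of our team

for patch_set in patch_sets_to_import(parse_patch_sets(SAMPLE), TEAM_AUTHORS):
    print(patch_set.number, patch_set.author)
```

Note that CVS records a user name rather than an email address, so the team list would have to hold whatever identifier the CVS server actually stores.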

Bonus question

We assumed that we would run git cvsexportcommit from the feature branch. Would it be better (meaning easier to maintain) to merge the feature into the master branch first, and then issue git cvsexportcommit? Would it make any real difference?

Or perhaps, our entire idea is wrong, and we should consider using an alternate solution?

Thank you in advance.

rubikonx9
  • Are branches both readable and writable in both cvs and git, or are some git-writes-only and others cvs-writes-only? (You're asking for a world of trouble if both have to be able to write.) I don't want to give a long detailed reply only to find out that I'm replying to the wrong question. Oh, I see this is from Dec 10. Perhaps you have solved your issue already. – Mort Apr 25 '16 at 00:08
  • We're using a semi-workaround-approach - we use cvsexportcommit from feature branches, but this fails in many cases, so we need to make manual adjustments quite often. And yes, both need to be writable :) Anyways, we've managed to talk the client into migrating to git, so there's a new hope... – rubikonx9 Apr 25 '16 at 09:18

1 Answer

Yeah, bidirectional live cvs<-->git mirroring is, from our experience, a delicate and error-prone thing that requires manual babysitting. I only had it going one way (commits in cvs, mirrored live to read-only git) and that was tough enough to keep correct.

Our repository was large enough that there was no tool that could do incremental updates in a timely enough manner as most (at least that we found) don't really do "incremental", they effectively start from scratch again. So we rolled our own, using viewvc to create a database view of the cvs repo from which we could then generate individual commits in git.

You should be able to use git cvsimport -k to kill CVS keywords on the git side.
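
As a rough illustration of what killing keywords means, the sketch below collapses expanded keywords such as `$Id: ... $` back to their bare form `$Id$`. This is an approximation written for this answer, not the code git cvsimport runs; the keyword list and the function name are assumptions, and `$Log$` is deliberately left out because it inserts extra lines rather than a single value.

```python
import re

# Common CVS/RCS keywords (assumed list; $Log$ is intentionally excluded).
KEYWORDS = (
    "Author", "Date", "Header", "Id", "Locker",
    "Name", "RCSfile", "Revision", "Source", "State",
)

# Matches "$Keyword$" as well as "$Keyword: expanded value $" on one line.
_KEYWORD_RE = re.compile(r"\$(" + "|".join(KEYWORDS) + r")(?::[^$\n]*)?\$")


def collapse_keywords(text: str) -> str:
    """Replace expanded CVS keywords with their unexpanded form."""
    return _KEYWORD_RE.sub(r"$\1$", text)


expanded = "/* $Id: main.c,v 1.2 2015/12/10 09:00:00 alice Exp $ */"
print(collapse_keywords(expanded))
```

If both the exported commit and the re-imported one carry the collapsed form, the file content is identical on both sides, which at least removes keyword noise from diffs and merges even though the two commits still have different ids.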

I would imagine that yes, maintaining a list of the members of your own team and having the cvs-->git tool execute special code for those commits coming back into git again would make a lot of sense.

I would think it would be simpler if you essentially made the sync one way, by doing all of your work in branches that are only written to in git. Regular commits would be trivially synced into CVS. You could keep up to date in git with periodic merges with master (or whatever CVS branch you're based on). The merge commit would just be committed back to CVS as a "merge branch dev-git with master" commit, lacking git's merge-aware merge commit stuff. Then you could have a periodic merge window where you arrange with the CVS teams to allow you to do the reverse: make a single commit into the CVS branch with your changes. You'd tag your "dev-git" branch with something meaningful, say "feature_xyz_20160419", and the commit comment in cvs would be "merge dev-git up to feature_xyz_20160419".
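
If you script that merge window, the naming convention is easy to keep consistent with a small helper. This is only a sketch of the convention described above; the function names are made up, and the actual tagging and committing would still be done with git and cvs.

```python
from datetime import date


def sync_tag(feature: str, day: date) -> str:
    """Build a tag name such as feature_xyz_20160419."""
    return f"feature_{feature}_{day:%Y%m%d}"


def cvs_commit_message(branch: str, tag: str) -> str:
    """Build the comment for the single commit made into CVS."""
    return f"merge {branch} up to {tag}"


tag = sync_tag("xyz", date(2016, 4, 19))
print(tag)
print(cvs_commit_message("dev-git", tag))
```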

Mort