Our company has decided to migrate our source code from ClearCase to Git, which is great :-)
I know that ClearCase and Git are completely different source code management systems, but we developers would like to end up with a single SCM that contains the complete history.
My colleague found the following tool, which imports ClearCase history into Git: https://github.com/charleso/git-cc
Unfortunately, our code base has more than 46,000 source files, and the history to import spans more than 10 years.
I analyzed this tool, and in my opinion there are two bottlenecks. The first is fetching the files from the ClearCase server. This is easy to solve by doing it in multiple threads. The second is the workflow of git-cc itself:
1. Get the history of the master branch via cleartool lshistory
2. Build changesets from the files and group them into commits
3. Get the specified version of the file(s) from the CC server and copy them to the working directory
4. git add .
5. git commit
6. Pick the next group and start again at step 3
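To make the grouping step of the workflow above concrete, here is a minimal sketch in Python. The grouping rule (same user, same comment, at most 60 seconds between consecutive check-ins) is my assumption for illustration and not necessarily what git-cc does; the names Record, Group and group_records are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Record:
    """One file version, as it might be parsed from lshistory output."""
    date: datetime
    user: str
    path: str
    version: str
    comment: str


@dataclass
class Group:
    """A set of file versions that should become one git commit."""
    user: str
    comment: str
    date: datetime
    files: list = field(default_factory=list)


def group_records(records, window=timedelta(seconds=60)):
    """Group records into commit candidates (assumed rule, see lead-in)."""
    groups = []
    for rec in sorted(records, key=lambda r: r.date):
        last = groups[-1] if groups else None
        same_change = (
            last is not None
            and last.user == rec.user
            and last.comment == rec.comment
            and rec.date - last.date <= window
        )
        if same_change:
            last.files.append((rec.path, rec.version))
            last.date = rec.date  # sliding window: measure from the latest check-in
        else:
            groups.append(
                Group(rec.user, rec.comment, rec.date, [(rec.path, rec.version)])
            )
    return groups
```

Each resulting group carries the list of (path, version) pairs that steps 3 to 5 would then fetch and commit.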
I think I could improve it by using low-level git commands and multiple threads.
Each commit group fetches its changes from the server and creates blob objects in the git database, so this could run for multiple groups in multiple threads. In addition, I would have one thread that creates the history in git from the blob objects that were just created.
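Roughly, what I have in mind looks like the sketch below. It is untested against a real ClearCase server: fetch_from_clearcase is a hypothetical placeholder for the actual download, the layout of the group dictionaries is my own assumption, and every file is stored with mode 100644. The git commands themselves (hash-object, update-index, write-tree, commit-tree, update-ref) are standard plumbing. Blobs are written by a thread pool, while a single thread updates the index and creates the commits in order.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def git(repo, *args, stdin=None, env=None):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, input=stdin, env=env,
        stdout=subprocess.PIPE, check=True,
    )
    return result.stdout.decode().strip()


def fetch_from_clearcase(path, version):
    """Hypothetical placeholder; a real importer would download the file here."""
    return f"{path}@@{version}\n".encode()


def write_blob(repo, path, version):
    """Fetch one file version and store it as a blob object (worker thread)."""
    content = fetch_from_clearcase(path, version)
    sha = git(repo, "hash-object", "-w", "--stdin", stdin=content)
    return path, sha


def import_groups(repo, groups, workers=8):
    """Import commit groups in order; returns the id of the last commit."""
    parent = None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Stage 1 (parallel): queue blob creation for all groups up front.
        pending = [
            [pool.submit(write_blob, repo, p, v) for p, v in group["files"]]
            for group in groups
        ]
        # Stage 2 (this thread only): index, tree and commit, strictly in order.
        for group, futures in zip(groups, pending):
            for future in futures:
                path, sha = future.result()  # waits until the blob exists
                git(repo, "update-index", "--add",
                    "--cacheinfo", f"100644,{sha},{path}")
            tree = git(repo, "write-tree")
            env = dict(
                os.environ,
                GIT_AUTHOR_NAME=group["author"],
                GIT_AUTHOR_EMAIL=group["email"],
                GIT_AUTHOR_DATE=group["date"],
                GIT_COMMITTER_NAME=group["author"],
                GIT_COMMITTER_EMAIL=group["email"],
                GIT_COMMITTER_DATE=group["date"],
            )
            args = ["commit-tree", "-m", group["message"]]
            if parent:
                args += ["-p", parent]
            parent = git(repo, *args, tree, env=env)
    if parent:
        git(repo, "update-ref", "refs/heads/master", parent)
    return parent
```

My assumption is that writing blobs from several processes at once is safe, whereas the index is guarded by a lock file and allows only one writer at a time, which is why only one thread touches it here.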
My question now is: does this make sense to you, or do you think I'm naive?
Have I forgotten any git locking mechanism?
Do you have any other ideas?