
I'm currently migrating from SVN to Git. The code base is a large Maven project of 10-15 modules. Until now, each module has had its own repository.

I wonder what architecture my Git repositories should have to handle the following use cases:

  • A user can checkout 1..N module(s).
  • John commits and pushes in module M. Emma pulls the change from the super-folder.
  • John commits and pushes a change of module M from the super-folder. Emma pulls the change from the module folder.
  • John moves (with `git mv`) file A from M1 to M2, commits and pushes. Emma edits file A, then updates (pulls) before committing. The file has moved and still carries Emma's change.

I considered the 'single-repository' architecture, but it does not handle UC#1. The 'submodule', 'subtree' and former 'one-module-one-repo' architectures cannot handle UC#4.

Moreover, even though the 'submodule' architecture handles most of the use cases, I would like to introduce as little complexity as possible. Submodules introduce concepts like detached HEAD and may lead to more frequent errors that are painful to repair.

I searched extensively and I am not sure it's possible without introducing too much complexity, but I hope some of you have found a workaround.

Remark: Our current SVN architecture cannot handle these use cases.

Thanks a lot, Maxime.

Maxime
  • Are your modules in the form of SVN externals in the SVN repository? As I understand it, you want the same functionality in Git. Right? – Dmitry Pavlenko Jul 16 '12 at 15:33
  • Even if my modules are not registered in the super module via an externals definition (which basically corresponds to the submodule feature in Git), that is the spirit. However, I also want Git to handle use case #4 – Maxime Jul 16 '12 at 15:36
  • If you try hard enough, you can likely come up with all sorts of scenarios that git handles differently than subversion. If that's your goal, just stick with subversion. Of course the opposite is also possible: there are situations subversion handles sub-optimally too. If your goal is to move to git, your focus should be on determining the most efficient usage of git, not how to make git match your subversion workflow. – wadesworld Jul 16 '12 at 16:02
  • I must have been misunderstood, I apologize. Currently, SVN does not handle (all) those cases correctly. I'm not looking for flaws in Git; I just want to take advantage of this migration to improve our workflow and make our everyday work easier. Also, I'm convinced of Git's capabilities and that's why I came here, to get help and find the best solution. – Maxime Jul 17 '12 at 06:52
  • I think SVN would very well handle your use cases if you put all modules in subdirectories of one single repository. So if these are important for you, why not keep SVN and just change your structure? (With `svnadmin dump` and `svnadmin load --parent-dir` you can even keep your history.) – Philipp Wendler Jul 17 '12 at 07:03
  • You're totally right. SVN can handle the use cases in the "single repository" configuration better than Git (because of the sparse checkout capabilities of SVN). However, there are other problems related to this configuration (in Git & SVN). Our project is quite big (40 devs) and we cannot afford a timeline where the commit messages of every (10-15) modules display in the same place. Poor software architect :p PS: Once again, if somebody proves to me (or at least tells me, with some clues) that it is impossible, I would be glad to reconsider my technical choices! – Maxime Jul 17 '12 at 07:49

2 Answers


Your analysis is correct; there's no obvious way to address all of those use cases at once. I would suggest one of 2 approaches:

  1. The first requirement of checking out modules individually may not actually be needed. With git you'll pay for the checkout just once when you do an initial clone, but after that incremental updates are very fast.
  2. If you're dealing with Maven modules, perhaps each with its own release cycle, then do the modules really need a source level relationship? If not, then the module dependencies could be represented solely in Maven.

Really, you should probably be starting with a single repository and then splitting things out later if you find it necessary. But you probably won't. :)

Russell Mull
  • Finally an answer! Not the one I wanted to hear, but an answer anyway ;) I agree with your first point. However, I forgot to explain why we are migrating to Git. In short, currently merges are anxiogenic and therefore developers try to avoid merges as often as possible. We want to break these bad habits and Git seems to be a good solution. But I must keep the use case 4 (move and edit) to give Git a clear lead in the developer eyes and tame that fear of merge. That's why I want a source level relationship. – Maxime Jul 17 '12 at 12:29
  • +1, start out with a single repo, that is the only way to cover your final use case. – cmcginty Jul 17 '12 at 22:57
  • +1 for point 2. If each module has its own release cycle, it is not suitable to set them up as modules. – Adrian Shum Jul 18 '12 at 03:29
  • In the year 2018, I think, VFSForGit might be a solution for the problem. https://github.com/Microsoft/VFSForGit – Ajeeb.K.P Dec 17 '18 at 09:11

you can't have all that. not on GIT and not on SVN.

you've already realized that your requirements conflict with each other and even admitted that your current setup does not cover all the situations so you should change the way you're approaching this problem.

instead of demanding certain capabilities from the tool, try to explain what the actual problems are and allow people to suggest ways to solve them; chances are those won't be things you've already considered.

I'll now try to answer the problems you've raised in the comments and completely ignore your initial request; I hope it helps more with the actual situation you're in.

"we cannot afford a timeline where the commit messages of every (10-15) modules display in the same place"

unlike SVN, in GIT you have branches (real branches) and each branch will have its own history, so as long as your devs use branches and you merge them instead of using rebase you should be able to isolate each branch log with the appropriate commands, see `git log --graph` to get the idea.
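
A minimal sketch of the kind of commands meant here, run in a throwaway repository. The module directories `M1` and `M2`, the `feature` branch and the commit messages are made up for illustration. The last query, which limits the log to one directory, goes beyond what is said above: it is a standard `git log` option that narrows the history to the commits touching a given path, which may help with the per-module timeline.

```shell
#!/bin/sh
# Throwaway repository; every name below is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
base=$(git symbolic-ref --short HEAD)   # default branch name (master or main)

mkdir M1 M2
echo "a" > M1/a.txt
git add M1
git commit -q -m "M1: initial"
echo "b" > M2/b.txt
git add M2
git commit -q -m "M2: initial"

# Work on module M1 happens on its own branch.
git checkout -q -b feature
echo "more" >> M1/a.txt
git commit -q -am "M1: feature work"

# Meanwhile, module M2 changes on the base branch.
git checkout -q "$base"
echo "more" >> M2/b.txt
git commit -q -am "M2: mainline work"

# Commits that exist only on the feature branch.
feature_only=$(git log --oneline "$base..feature" | wc -l | tr -d ' ')

# Merge (no rebase) so the branch stays visible in the graph.
git merge -q --no-ff --no-edit feature
git --no-pager log --graph --oneline

# Assumption beyond the answer: limit the log to one module's directory.
m1_commits=$(git log --oneline -- M1 | wc -l | tr -d ' ')
echo "feature-only commits: $feature_only, commits touching M1: $m1_commits"
```

Whether this is enough for a 40-developer team depends on how strictly each module's work stays on its own branches and in its own directory.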

"currently merges are anxiogenic and therefore developers try to avoid merges as often as possible"

there's no real solution for this but there's ways to mitigate the problem.

one way is to have several clones of the repo along with the master copy. if your team is about 40 people and you have 10-15 modules then I guess you have small teams there that focus on particular areas/modules; if that's the case then each subteam should have its own clone of the repo and merge locally there before merging back to the master copy.

this approach effectively splits the merge process (and the responsibility) in two phases, one that concerns the changes within the subteam and another that deals with the interaction with the rest of the modules.
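
A rough sketch of that two-phase setup using local bare repositories. The repository names `central.git` and `team.git`, the `seed` and `dev` clones and the `ident` helper are all hypothetical, and the second phase is simplified to a plain push; in practice the subteam would first merge the master copy's latest changes into its own repository.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: set a commit identity inside a clone.
ident() {
    git -C "$1" config user.name "Demo"
    git -C "$1" config user.email "demo@example.com"
}

# The master copy, seeded with one commit.
git init -q --bare central.git
git clone -q central.git seed 2>/dev/null
ident seed
echo "v1" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"
git -C seed push -q origin HEAD

# The subteam's own shared clone of the master copy.
git clone -q --bare central.git team.git

# A developer in the subteam clones from the team repository, not from central.
git clone -q team.git dev
ident dev
echo "team change" >> dev/README
git -C dev commit -q -am "Subteam change"
git -C dev push -q origin HEAD      # phase 1: integrate within the subteam

# Phase 2: once the subteam is happy, its branches go to the master copy.
git --git-dir=team.git push -q --all origin

git --git-dir=central.git --no-pager log --oneline
```

The point of the layout is that conflicts inside a subteam are resolved against `team.git`, and only the already-integrated result meets the other modules' changes.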

"But I must keep the use case 4 (move and edit) to give Git a clear lead in the developer eyes and tame that fear of merge"

I'll be completely honest, UC#4 is impossible*. particularly on GIT where the mv operation is actually a composition of rm and add.

perhaps if the addition happens before the movement some (d)VCS can figure it out but I don't think that's the case for GIT, even so I think you're taking the wrong way to "tame that fear of merge" let me explain.

* @sleske suggests checking this thread for a way to do UC#4: http://stackoverflow.com/questions/2701790/git-merge-with-renamed-files

the reason people fear the merge is because they don't understand it and SVN forces you to merge upstream (that is, on the server) which adds pressure, the problem with your approach is that by trying to help them avoid it you're reinforcing the idea that merges are something obscure and dangerous that should be avoided, don't do that.

if you want them to get over the fear you need to train them so they have the tools to deal with the situation. in other words, don't help them avoid the problem, force them to solve it: teach them about all the merges and conflict styles, tell them about rerere, even teach them the octopus merge, which I've never used but what the hell, teach them that too! and then MAKE them practice so it becomes something they know and can handle.
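
For reference, a small sketch of two settings related to the points above, applied to a throwaway repository. `rerere.enabled` and `merge.conflictStyle` are real Git configuration keys; the choice of `diff3` is only one possible conflict style, picked here as an example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Record conflict resolutions and replay them when the same conflict recurs.
git config rerere.enabled true
# Show the common ancestor's version inside the conflict markers as well.
git config merge.conflictStyle diff3

git config --get rerere.enabled
git config --get merge.conflictStyle
```

Both settings are per repository here; adding `--global` to the `git config` calls would apply them to every repository of the current user.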

merges in GIT aren't as stressful as with SVN because they're also local, so you can do them as many times as you want without fear of screwing up other people's environment; you'll only push them once you're absolutely sure they're ok.

that's all for now, if you have additional concerns add a comment and I'll see if I have an idea, good luck!

Samus_
  • Hi, Samus. Before everything else, thank you for taking the time to answer. Your first point, concerning a solution to the commit mess, is questionable. Indeed, you must know that branches are not repositories and the commits of every module on the dev branch will be mixed in the same history. Concerning the merges, I hadn't thought of your solution and I think it's pretty good. I need to try it before telling you whether it matches our requirements or not. As for UC#4, it's possible, but only in a 'single-repository' fashion. – Maxime Jul 19 '12 at 07:56
  • Finally, concerning the merges, I would really like to teach them how to use git "the good way". However, every action should be possible inside the IDE (IntelliJ), which has pretty good Git integration but IMO does not allow using rerere correctly. Thanks for everything, I will try to update this post if I find a "good enough" solution. – Maxime Jul 19 '12 at 07:56
  • *"UC#4 is impossible"*: I think that is wrong. If you move a file in one branch and change it in another, git *will* generally merge that automatically. See e.g. http://stackoverflow.com/questions/2701790/git-merge-with-renamed-files – sleske Jul 19 '12 at 15:26
  • nice! notice that he's complaining about the merge conflict (delete/modify) that I was referring to, but I didn't know about `rename-threshold`, which might be able to deal with the problem. I'll edit my reply, thanks for the tip. – Samus_ Jul 19 '12 at 15:35
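
The move-and-edit case from UC#4 can be tried out in a throwaway repository. The sketch below is only an illustration under stated assumptions: a single repository, two local branches standing in for John and Emma, and made-up file contents. Git does not record renames; it detects them by content similarity at merge time, so a file that is both moved and heavily rewritten may not be matched and can still end in a conflict.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
base=$(git symbolic-ref --short HEAD)   # default branch name (master or main)

mkdir M1 M2
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > M1/A.txt
git add M1
git commit -q -m "Add A to M1"

# John moves A from M1 to M2 on his branch.
git checkout -q -b john
mkdir -p M2
git mv M1/A.txt M2/A.txt
git commit -q -m "Move A from M1 to M2"

# Emma, unaware of the move, edits A at its old location.
git checkout -q "$base"
echo "line 6 (Emma)" >> M1/A.txt
git commit -q -am "Edit A"

# Merging John's work: rename detection carries Emma's edit over to M2/A.txt.
git merge -q --no-edit john

cat M2/A.txt
```

After the merge, `M1/A.txt` is gone and `M2/A.txt` ends with Emma's added line.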