I'd like to promote the idea of a monorepo within my company.
I plan to use git submodules this way:

I have one 'parent' repo holding one submodule for each component of our stack, thus maintaining global versioning for the whole stack (we can simply check out every component on a given branch).

This sounds perfect because we can still benefit from any CI service out of the box (as we still push to independent git repos, the submodules).

The only (terrible) weakness with this approach is that if I do a

git submodule update --remote

Using the following config:

[submodule "commonLib"]
   path = commonLib
   url = git@github.com:org/commonLib.git
   branch = MY_BRANCH

Each submodule is effectively checked out at the right commit.

But they are all in detached HEAD state.

Why is there no way to effectively use git submodules with a branch, i.e. when updating, check out the branch itself and not the commit this branch points to? Is there a technical reason for this, or is it simply not yet implemented in git?

Thanks

Clement
  • We ran into that identical issue. The problem is that each time a submodule is updated, the master repository needs to update submodules, which will bring head up to date with the master branch. In the end, we combined the core projects into a single repo, which solved the issue. Submodules work best for third party dependencies, and other private projects which don't change often. Ideally, each submodule should build independently, so it can run in CI by itself. You might look at google git-repo, which partially addresses this issue. https://gerrit.googlesource.com/git-repo/ – the_storyteller Oct 18 '17 at 19:33
  • Can you elaborate on the problem you are talking about? I'm not sure I understand it correctly. – Clement Oct 18 '17 at 19:39
  • From what I understand, if you use branch tracking, the checked-out submodule points directly to the right commit but not to the branch pointing to that commit. This means that everyone must take care to check out the right branch before working... and that's what bothers me. – Clement Oct 18 '17 at 19:41

1 Answer

One part of the answer is that git submodules are designed to allow a consistent/coherent view of a set of multiple repositories. And the only way to achieve that is to have each submodule locked at a particular version, with the parent repo tracking all versions for all submodules, thus giving the overall project the appearance of a monorepo.

In such a project context it doesn't make much sense to specify a submodule only at the branch level, because that may pick up a version which isn't consistent with the rest of the project.
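
To see this pinning concretely, the commit recorded by the parent can be read straight from the parent's tree. This is only a minimal sketch: `pinned_commit` is a hypothetical helper name, and `commonLib` is the submodule path from the question's config.

```shell
# Hypothetical helper: print the commit the parent repo has recorded for a
# submodule path. Meant to be run from the top level of the parent repo.
# Usage: pinned_commit <submodule-path>
pinned_commit() {
    git rev-parse "HEAD:$1"
}
```

Running `pinned_commit commonLib` in the parent prints a commit ID, not a branch name: the parent stores no branch for the submodule in its history, only this commit.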

Another part of the answer is not specific to git submodules but applies to any git repo: when you check out a specific commit rather than a branch, the repo ends up in detached HEAD state. Git offers little support for identifying which branch such a commit belongs to, because in git branches don't have the same meaning and importance as in other version control systems; see this excellent answer for details: https://stackoverflow.com/a/3162929/4495081.
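
As a small illustration, the detached state can be detected in any repo, submodule or not. This is a sketch with a hypothetical helper name (`current_branch`); it relies on `git symbolic-ref` failing when `HEAD` points directly at a commit.

```shell
# Hypothetical helper: print the branch HEAD is attached to, or "(detached)"
# when HEAD points directly at a commit.
# Usage: current_branch [repo-dir]   (defaults to the current directory)
current_branch() {
    git -C "${1:-.}" symbolic-ref --quiet --short HEAD || echo "(detached)"
}
```

Called on a submodule directory right after `git submodule update --remote`, this would be expected to print `(detached)`, which is the situation described in the question.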

I see 2 possible ways of reducing the risk of human error when picking the right branch at updates:

  • use consistent branch names across all your repositories and labels/tags (ideally produced by your CI/CD system) which have the branch name encoded in their identifier. This is quite a difficult sell at the beginning, with every component-owning team wanting complete decision-making power, but it can get better if/when the teams eventually understand that they actually need alignment to make a coherent product together.

  • provide project-level automation wrappers to operate on the repositories, which would extract the proper branch information from the parent repo (while also performing sanity checks and/or related operations to maintain the developer's workspace and the project consistency).
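
A minimal sketch of what such a wrapper might look like, assuming a POSIX shell and the `branch` key in `.gitmodules` shown in the question. The function names `submodule_branch` and `attach_submodules` are hypothetical, and a real wrapper would likely need more checks (local changes, missing branches, nested submodules).

```shell
# Hypothetical helper: print the branch configured for a submodule in a
# .gitmodules file; fails if no branch is configured.
# Usage: submodule_branch <gitmodules-file> <submodule-name>
submodule_branch() {
    git config -f "$1" --get "submodule.$2.branch"
}

# Hypothetical wrapper, meant to run inside the parent repo after
# "git submodule update": put each submodule on its configured branch and
# warn when that branch is not at the commit the parent has pinned.
attach_submodules() {
    top=$(git rev-parse --show-toplevel) || return 1
    git config -f "$top/.gitmodules" --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        # key looks like "submodule.<name>.path"; strip both ends to get <name>
        name=${key#submodule.}
        name=${name%.path}
        # Skip submodules that have no branch configured
        branch=$(submodule_branch "$top/.gitmodules" "$name") || continue
        pinned=$(git -C "$top" rev-parse "HEAD:$path")
        git -C "$top/$path" checkout --quiet "$branch" || continue
        # Sanity check: the branch should sit on the commit the parent recorded
        if [ "$(git -C "$top/$path" rev-parse HEAD)" != "$pinned" ]; then
            echo "warning: $path: $branch is not at pinned commit $pinned" >&2
        fi
    done
}
```

The warning matters because checking out the branch can move the submodule away from the commit the parent recorded, which is exactly the inconsistency the pinning is meant to prevent.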

Dan Cornilescu