I have a submodule in Git, and I want different branches to use different URLs for this submodule, specified in .gitmodules. I know that's possible, but now, whenever I switch between these branches, it's a whole process to get the submodule onto the right hash: I have to deinit the submodule, delete its directory under .git/modules, and then re-init, which takes a while. Is there an easier way to switch branches and land on the right hash, given that each branch uses a different submodule URL?

phd
Adam
  • Use a `post-checkout` hook. See https://stackoverflow.com/a/37383406/7976758 and https://github.com/chaitanyagupta/gitutils/blob/973696c7e556d6a67ef890e159dbd0290233a820/submodule-hooks/post-merge-checkout – phd Sep 23 '22 at 18:37
  • Another solution: don't switch branches. Instead, keep a separate clone for each branch, with that branch checked out. Keep in mind that `git worktree` support for submodules is incomplete, so you need a full clone for every branch. – phd Sep 23 '22 at 18:39
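
As a rough sketch of the hook approach from the first comment: a `post-checkout` hook could re-sync the submodule URLs from `.gitmodules` and then update the submodules after each branch switch. The function name `install_submodule_hook` is hypothetical, the sketch assumes the default hooks directory (no `core.hooksPath`), and it is not the script from the linked answers.

```shell
# Hypothetical helper: writes a post-checkout hook into the given repository.
# Assumes hooks live in <git-dir>/hooks (i.e. core.hooksPath is not set).
install_submodule_hook() {
    repo=${1:-.}
    git_dir=$(git -C "$repo" rev-parse --absolute-git-dir) || return 1
    mkdir -p "$git_dir/hooks"
    cat > "$git_dir/hooks/post-checkout" <<'EOF'
#!/bin/sh
# Git passes: $1 = previous HEAD, $2 = new HEAD, $3 = 1 for a branch checkout.
[ "$3" = 1 ] || exit 0
# Copy the URLs from the checked-out .gitmodules into the local configuration,
# then fetch and check out the commits recorded for each submodule.
git submodule sync --recursive
git submodule update --init --recursive
EOF
    chmod +x "$git_dir/hooks/post-checkout"
}
```

Running `install_submodule_hook .` from the superproject would install the hook. Whether sync plus update is enough when the two URLs point at unrelated repositories is something to verify in your own setup.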

2 Answers


Branches (by which I mean branch names) are not relevant here.[1] What matters are the commits and, unfortunately in your case, the order in which you check them out.

A submodule is nothing more or less than a separate Git repository. If you don't have the submodule yet, Git has to run git clone to get it. Git builds the instructions for that git clone from two pieces of information in one specific superproject commit: a gitlink, which is a tree entry for a path like path/to/submodule with mode 160000 and a commit hash ID, and the entry for that path in the .gitmodules file of the same commit.
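
To see both halves of that recipe for a given commit, something like the following sketch may help. The function name `submodule_recipe` is hypothetical, and it assumes the submodule's name is the same as its path (the default when using `git submodule add`).

```shell
# Hypothetical helper: print the gitlink and the clone URL that a given
# superproject commit records for a submodule.
# Assumes the submodule's name equals its path and that it is run from the
# top level of the superproject.
submodule_recipe() {
    commit=$1
    path=$2
    # The gitlink is a tree entry of the form: 160000 commit <hash> <path>
    git ls-tree "$commit" "$path"
    # Read .gitmodules as stored in that commit, not the working-tree copy.
    git config --blob "$commit:.gitmodules" --get "submodule.$path.url"
}
```

For example, `submodule_recipe HEAD path/to/sub` prints the mode-160000 entry followed by the URL recorded in that commit.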

Let's take a concrete example. Suppose the superproject is named super, and in commit a123456, path/to/sub has mode 160000 and says commit 5555555. You clone the superproject and check out commit a123456. The submodule is not yet cloned, but the index for the superproject says "we need commit 5555555 for path/to/sub" at this point.

If you now run git submodule update --init (or used a recursive clone, which does this for you), Git notices that, hey, the submodule isn't cloned yet, so it looks in .gitmodules for path/to/sub to find the instructions for cloning it. It then runs git clone with (in effect) --no-checkout, then enters the submodule and checks out commit 5555555 (by raw hash ID).

But now path/to/sub is associated with a submodule that is cloned. Let's say you now check out commit b789abc in super and it says "use commit 5678567". That goes into Git's index (the index for super, that is; the index and working tree of the submodule itself are as yet untouched). If you now run git submodule update, or have turned on recursive checkout (which does this for you), Git notices that the submodule is already cloned, and does not bother to look in the .gitmodules file from b789abc. It reuses the existing clone, with the URL it recorded the first time, and looks for commit 5678567 there.

Note that, had you not run git submodule update --init while on commit a123456, the submodule would not yet be cloned. So actions in super would now look at the .gitmodules file as it appears in commit b789abc, which might have a completely different URL. Git would then clone that URL and look for commit 5678567 there.
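
One way to observe this state: once a submodule is initialized, its URL is recorded in the superproject's `.git/config`, while a later checkout only changes the working-tree `.gitmodules`. The sketch below compares the two. The function name `submodule_url_is_stale` is hypothetical, and it assumes absolute URLs, since relative ones are resolved before being recorded.

```shell
# Hypothetical helper: succeeds (exit 0) when the URL in the working-tree
# .gitmodules differs from the URL recorded in .git/config for a submodule.
# Exits 1 when they match and 2 when either value is missing.
# Run from the top level of the superproject.
submodule_url_is_stale() {
    name=$1
    wanted=$(git config -f .gitmodules --get "submodule.$name.url") || return 2
    recorded=$(git config --get "submodule.$name.url") || return 2
    printf 'in .gitmodules: %s\nrecorded:       %s\n' "$wanted" "$recorded"
    [ "$wanted" != "$recorded" ]
}
```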

This is the problem you're trying to work around. There isn't a workaround for it, other than the horrible manual process you have been using.

(The usual term for this problem is that you have a "path-dependent outcome". Git doesn't expect this and just doesn't handle it properly.)
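
The manual process can at least be wrapped in a function so that it is one command per switch. This is only a sketch: the name `reclone_submodule` is hypothetical, it assumes the submodule's name equals its path, and it throws away any local work in the submodule.

```shell
# Hypothetical helper wrapping the manual process: discard the existing clone
# of one submodule and clone it again from the URL in the current .gitmodules.
# Assumes the submodule's name equals its path and that it is run from the
# top level of the superproject. Any local work in the submodule is lost.
reclone_submodule() {
    path=$1
    [ -n "$path" ] || return 1
    git_dir=$(git rev-parse --absolute-git-dir) || return 1
    # Unregister the submodule and empty its working tree.
    git submodule deinit -f -- "$path" || return 1
    # Remove the stored clone so the next update cannot reuse it.
    rm -rf "$git_dir/modules/$path"
    # Re-register from .gitmodules, clone, and check out the recorded commit.
    git submodule update --init -- "$path"
}
```

After checking out the other superproject commit, `reclone_submodule path/to/sub` would then redo the clone from whichever URL that commit's .gitmodules names.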


[1] In particular, multiple branch names can point to the same commit. If main and develop both point to commit a123456, it won't matter which name you use, as you get commit a123456 either way. Branch names move over time, so just because main identifies a123456 today does not mean it will identify a123456 tomorrow.

torek

One solution would be to use a different submodule name for each of the different source URLs.

For example, assume that I sometimes want to use package pkg from source url_a, and sometimes use pkg acquired from source url_b. I would first create two independent submodules pkg_from_a and pkg_from_b in my repository:

    git submodule add <url_a> pkg_from_a
    git submodule add <url_b> pkg_from_b

Now I have two independent submodules, but how do I select the one in use? That is done by creating a symbolic link pkg which points to either pkg_from_a or pkg_from_b as required. For instance, if I start from a branch which should use pkg_from_a, I would create and commit the link like this:

    ln -s pkg_from_a pkg     # NOTE: Unix syntax for creating a symbolic link
    git add pkg
    git commit pkg

Repointing the link is all that is needed to switch between the two submodules, provided both are initialized. The link can change on a commit-by-commit basis, so different branches can easily use pkg sourced from different locations.
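
Switching could then look something like this sketch. The function name `use_pkg_source` is hypothetical; `pkg`, `pkg_from_a`, and `pkg_from_b` are the names from the example above.

```shell
# Hypothetical helper: repoint the pkg symlink at one of the submodule
# directories and stage the change. Run from the top level of the repository.
use_pkg_source() {
    target=$1
    if [ ! -d "$target" ]; then
        echo "no such submodule directory: $target" >&2
        return 1
    fi
    # Remove the old link first; a plain `ln -s` onto an existing link to a
    # directory would create the new link inside that directory instead.
    rm -f pkg || return 1
    ln -s "$target" pkg || return 1
    git add pkg
}
```

For example, `use_pkg_source pkg_from_b` followed by a commit records the switch on the current branch.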