23

I'm using Git subtree to "link" several subprojects into the main project (I'm coming from "svn:externals"). I've used it for some weeks, but the time to push changes to the subtree remote increases every commit.

$ git subtree push -P platform/rtos rtos master

git push using:  rtos master

1/    215 (0)2/    215 (1)3/    215 (2)4/    215 (3)5/    215 (4)6/    215 (5)7/    215 (6)8/    215 (7)9/    215 (8)10/    215 (9)11/    215 (9)12/    215 (10)13/    215 (11)14/    
...
20 more lines
...
(204)209/    215 (205)210/    215 (206)211/    215 (207)212/    215 (208)213/    215 (209)214/    215 (210)215/    215 (211)To https://github.com/rtos/rtos.git
   64546f..9454ce  9a9d34c5656655656565676768887899898767667348590 -> master

Is there any way to "clean up" the subtree and therefore reduce the time to push changes?

Anthony Mastrean
ferraith

5 Answers

19

Try using the --rejoin flag, so that after the split the subtree history is merged back into your main repository. That way, each later split does not need to walk the entire history again.

git subtree split --rejoin --prefix=<prefix> <commit...>

From the original subtree documentation:

After splitting, merge the newly created synthetic history back into your main project. That way, future splits can search only the part of history that has been added since the most recent --rejoin.
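
One way to apply this to the push from the question, as a rough sketch: run the split with --rejoin explicitly, then push as usual. The helper name push_subtree_with_rejoin is made up for illustration, HEAD is assumed to be the commit to split, and the working tree is assumed to be clean because --rejoin creates a merge commit. Whether this shortens later pushes may depend on the repository and the Git version, so it is worth timing before and after.

```shell
#!/bin/sh
# Sketch only: push_subtree_with_rejoin is a hypothetical helper, not a git command.
push_subtree_with_rejoin() {
    prefix=$1   # subtree directory, e.g. platform/rtos
    remote=$2   # remote name or URL, e.g. rtos
    branch=$3   # branch on the subtree remote, e.g. master

    # Split the subtree history and merge the synthetic history back
    # into the main project, which records a join point.
    git subtree split --rejoin --prefix="$prefix" HEAD || return 1

    # Push as before; the split that push runs internally may then
    # only need to look at history added since the join point.
    git subtree push --prefix="$prefix" "$remote" "$branch"
}

# Usage, with the values from the question:
#   push_subtree_with_rejoin platform/rtos rtos master
```

Keeping the split and the push as two separate commands avoids relying on git subtree push accepting split options, which is assumed here not to be available in every Git version.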

Maic López Sáenz
  • 5
    And use which ? I tried with the tip of that subtree and though it did the split/rejoin (not even sure what that means), my next subtree push was just as lengthy :( – Jorge Orpinel Pérez Mar 09 '15 at 19:59
  • 2
    Why would this work (can you explain what's happening when you use split/rejoin)? and how often do I need to run this (once? once-per-some-unspecified-event? before every push?) – Anthony Mastrean Feb 24 '20 at 16:28
7

No, unfortunately not. When you run git subtree push, it recreates all commits for this subtree. It has to, because each commit's SHA depends on its parent commit, and it needs those SHAs to link the new commits to the old ones. It could cache that, but it doesn’t.
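
To see the dependency on the parent commit in isolation, here is a small sketch that uses git plumbing in a throwaway repository. It assumes git and mktemp are installed; demo_parent_changes_id is a hypothetical helper name, and the identity and date values are arbitrary placeholders. It builds two commits that are identical except for their parent and prints their IDs.

```shell
#!/bin/sh
# Sketch only: demo_parent_changes_id is a hypothetical helper, not a git command.
demo_parent_changes_id() (
    # Subshell body: the cd and exports below do not leak to the caller.
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q . || exit 1

    # Fixed identity and dates, so the IDs differ only through message and parent.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    export GIT_AUTHOR_DATE='946684800 +0000' GIT_COMMITTER_DATE='946684800 +0000'

    tree=$(git mktree </dev/null)                 # write the empty tree
    root_a=$(git commit-tree -m 'root A' "$tree")
    root_b=$(git commit-tree -m 'root B' "$tree")

    # Same tree, message, author and date; only the parent differs.
    child_a=$(git commit-tree -m 'child' -p "$root_a" "$tree")
    child_b=$(git commit-tree -m 'child' -p "$root_b" "$tree")

    echo "$child_a"
    echo "$child_b"

    # Remove the throwaway repository.
    cd / && rm -rf "$repo"
)

# Usage:
#   demo_parent_changes_id    # prints two different commit IDs
```

Because the two printed IDs differ, a rewritten subtree commit cannot be given its ID until the ID of its rewritten parent is known, which is why the regeneration runs through the history in order.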

I guess this is the price you pay for using subtree vs. submodules. Subtree is very stateless in your repository, which is nice on the one hand but causes these long computations on the other hand. Submodules store all their information which requires you to manage it but also makes stuff like this a lot faster.

Chronial
  • 8
    It is surprising `git subtree` can't (doesn't?) just cache this information, so that after the first `git subtree push` everything becomes relatively snappy. – davidg Jul 03 '14 at 06:40
  • @davidg It surely could. – Chronial Jul 03 '14 at 10:21
  • 3
    Especially on Windows this is ridiculously slow. In my case when you hit the 600 assertion count it takes over a minute for each push making git useless at that point. – Jorge Orpinel Pérez Mar 10 '15 at 15:11
0

@LopSae's answer works, but it will clutter your repo with lots of commits if you use squash to merge/pull.

Here is a way to avoid that:

git subtree split --rejoin --prefix=<subtree/path> --ignore-joins

After doing this, you will want to push back to the branch you pulled from. Otherwise you can't create a pull request.

There are at least two options for handling this:

  1. Rebase.

  2. Create a new branch on the remote project, pull it, and push back to that branch.

maxisam
  • The question is about `git subtree push` and not `split`, I don't think the `rejoin` option is applicable. – Anthony Mastrean Feb 21 '20 at 20:13
  • 2
    @AnthonyMastrean ah....git subtree push does a split first. Please check the original document. And if you really try it, you can see how it works. – maxisam Feb 23 '20 at 09:58
  • I looked at the source for [git-subtree.sh](https://github.com/git/git/blob/51ebf55b9309824346a6589c9f3b130c6f371b8f/contrib/subtree/git-subtree.sh#L894) and, yep, there's a `subtree split` in there. I'm still not following the mechanism for your answer though... if we setup a split/rejoin manually... that'll make subsequent pushes work better? – Anthony Mastrean Feb 24 '20 at 15:53
  • It creates a split point and calculates from there instead of from the beginning. Did you ever try it? – maxisam Feb 25 '20 at 16:14
  • @maxisam I tried it just now, and it does not work. -1 – Qqwy Sep 18 '21 at 09:46
-2

Maybe this helps: I think you can tell git subtree split to only go back n commits by doing

git subtree split --prefix XXX HEAD~n..

or by specifying the commit you want to start with for example

git subtree split --prefix XXX 0a8f4f0^..

This helps reduce the time, though it's inconvenient.

ehremo
-6

Note that if you decide to switch to git submodule, you can now, since Git 1.8.2 (2013-03-08), track the latest commits of a submodule repo.

See git externals.

"git submodule" started learning a new mode to integrate with the tip of the remote branch (as opposed to integrating with the commit recorded in the superproject's gitlink).

That could make for quicker pushes, while benefiting from the additional information a submodule has over a subtree (i.e., a lightweight record of a specific commit of the submodule).

You can update that submodule to the latest of a given branch with:

git submodule update --remote

This option is only valid for the update command.
Instead of using the superproject's recorded SHA-1 to update the submodule, use the status of the submodule's remote tracking branch.
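
As a rough sketch of that workflow, the two helpers below add a submodule that records which branch to follow and later move it to the tip of that branch. The function names are hypothetical; the URL, path, and branch in the usage lines are taken from the question and would need adjusting for another project.

```shell
#!/bin/sh
# Sketch only: both helper names are illustrative, not git commands.

# Add a submodule and record the branch that --remote should follow
# (git stores it in .gitmodules).
add_tracking_submodule() {
    url=$1; path=$2; branch=$3
    git submodule add -b "$branch" "$url" "$path"
}

# Move one submodule to the tip of its tracked remote branch, then
# stage the updated gitlink so the superproject can commit it.
update_submodule_to_remote_tip() {
    path=$1
    git submodule update --remote -- "$path" || return 1
    git add -- "$path"
}

# Usage:
#   add_tracking_submodule https://github.com/rtos/rtos.git platform/rtos master
#   update_submodule_to_remote_tip platform/rtos
#   git commit -m "Update rtos submodule"
```

The superproject still records one specific commit per submodule, so the update only takes effect for others once the staged gitlink is committed and pushed.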

Community
VonC