
I'm contributing to a fairly small open source project hosted on GitHub. So that other people can take advantage of my work, I've created my own fork on GitHub. Despite GitHub's choice of terminology, I don't wish to diverge totally from the main project. However, I don't expect or want all of my work to be accepted into the main repository. Some of it, however, has already been merged into the main repository, and I expect this to continue. The problem I am running into is how best to keep our two trees in a state where code can be shared between them easily.

Some situations I have encountered or expect to encounter include:

  • I commit code that is later accepted into the main repository. When I pull from the main repository in the future, my commit is duplicated in my repository.
  • I commit code that is never accepted into the main repository. When I pull from the main repository in the future, the two trees have diverged and reconciling them is hard.
  • Another person comes along and bases their work on my repository. Thus, I should if at all possible avoid changing commits that I have pushed, for example by using git rebase.
  • I wish to submit code to the main repository. Ideally, my changes should be easy to turn into patches (preferably with git format-patch) that apply directly and cleanly to the main repository.

As far as I can tell there are two, or possibly three ways to handle this, none of which work particularly well:

  • Frequently run git rebase to keep my changes based off the head of the upstream repository. In this way I can eliminate duplicated commits but often have to rewrite history, causing problems for people wanting to derive their work from mine.
  • Frequently merge the upstream repository changes into mine. This works ok on my end but does not seem to make it easy to submit my code to the upstream repository.
  • Use some combination of these and possibly git cherry-pick to keep things in order.

What have other people done in this situation? I know my situation is analogous to the relationship between various kernel contributors and Linus's main repository, so hopefully there are good ways to handle this. I'm fairly new to git though, so I haven't mastered all its nuances. Finally, especially due to GitHub, my terminology may not be entirely consistent or correct. Feel free to correct me.

orangejulius
  • Note that even if you (force) push your rebased changes, other people can also easily pull with rebase to update. It's just another workshop where history is continuously being rewritten everywhere. On top of that you need to be a little more careful due to the force push being able to wipe out everything :) – rubenvb Jan 30 '18 at 07:27

1 Answer


Some tips I've learned from a similar situation:

  • Have a remote tracking branch for the upstream author's work.
  • Pull changes from this tracking branch into your master branch every so often.
  • Create a new branch for each of the topics you're working on. These branches should generally be local only. When you get changes from upstream into master, rebase your topic branches to reflect these changes.
  • When you're done with some topic work, merge it into master. This way, people who are deriving work from yours will not see much rewritten history, since the rebasing occurred only in your local topic branches.
  • Submitting changes: Your master branch will basically be a series of commits, some of which are the same as upstream, the rest are yours. The latter can be sent as patches if you want to.
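
The tips above can be sketched as a shell session. This is only a sketch under stated assumptions: it builds a throwaway sandbox in which a local repository stands in for the main project, and the names `upstream`, `fork`, `topic-foo`, and the file names are made up for illustration rather than taken from any real project.

```shell
#!/bin/sh
# Sandbox demo of the topic-branch workflow. All repository, branch,
# and file names here are hypothetical.
set -e
sandbox=$(mktemp -d)

# Stand-in for the main project: one commit on master.
git init -q "$sandbox/upstream"
cd "$sandbox/upstream"
git symbolic-ref HEAD refs/heads/master   # force the branch name "master"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "v1" > README
git add README
git commit -q -m "Initial upstream commit"

# Your repository, with the main project available as the remote "upstream".
# upstream/master is the remote-tracking branch for the upstream author's work.
git clone -q --origin upstream "$sandbox/upstream" "$sandbox/fork"
cd "$sandbox/fork"
git config user.name "Demo"
git config user.email "demo@example.com"

# Work on a local-only topic branch.
git checkout -q -b topic-foo master
echo "foo" > feature.txt
git add feature.txt
git commit -q -m "Add feature foo"

# Meanwhile, the main project moves on.
(cd "$sandbox/upstream" && echo "v2" >> README && git commit -q -am "Upstream change")

# Bring the upstream changes into master.
git fetch -q upstream
git checkout -q master
git merge -q --ff-only upstream/master

# Rebase the topic branch onto the updated master (local history only).
git checkout -q topic-foo
git rebase -q master

# Topic is done: merge it into master.
git checkout -q master
git merge -q topic-foo

# Commits on master that upstream lacks become patches for submission.
git format-patch -o "$sandbox/patches" upstream/master..master
```

In a real setup the GitHub fork would usually be the `origin` remote and the main project would be added separately with `git remote add upstream <url>`; the sandbox collapses that into a single remote to stay self-contained. Because the topic branch is rebased before it is merged, only the unpublished branch has its history rewritten, and master only ever moves forward.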

Of course, the choice of branch and remote names is your own. I'm not sure these tips cover every scenario, but they address most of the hurdles I've run into.

sykora
  • In the time since I asked this question I've gained a lot more experience with Git and have approximately settled on this as the best answer. Keeping changes under development only as local commits is the key. Definitely a very good tip. Thanks. – orangejulius Sep 17 '09 at 05:06
  • Have you ever worked on a topic branch on more than one computer? If so, what did you do? – Andrew Grimm Oct 13 '09 at 02:18
  • No, I haven't worked on topic branches across computers, but it's not the "across computers" part that is difficult. It's the "across people" that is. You can just as easily push your topic branches to the server so _you_ can work with it. – sykora Oct 13 '09 at 02:36
  • What if you have a team of programmers working on different topic branches on the same externally maintained open-source project? You would then have a company git server where all the programmers push. You will have the master branch there with a lot of merge commits from their topic branches. Then when you send the patches upstream, you will most probably squash those merge commits into single commits. Then if they are accepted, you get them again from upstream. Won't they clash with the commits you already have on your git server? What's the most clean way to conduct such development? – Alexander Amelkin Feb 20 '18 at 17:53