86

I've been dealing with the problem of scaling CI at my company while trying to figure out which approach to take to CI across multiple branches. There is a similar question on Stack Overflow, Multiple feature branches and continuous integration. I've started a new one because I'd like to get more of a discussion going and provide some analysis in the question.

So far I've found that there are 2 main approaches that I can take (or maybe some others???).

So it seems that if I want to provide devs with CI for their own custom branches, I need special tooling for Jenkins (API, shell scripts, or something similar?) and a way to handle scaling. Or I can tell them to merge to DEV more often and live without CI on custom branches. Which one would you take, or are there other options?

toomasr

4 Answers

71

When you talk about scaling CI you're really talking about scaling the use of your CI server to handle all your feature branches along with your mainline. Initially this looks like a good approach as the developers in a branch get all the advantages of the automated testing that the CI jobs include. However, you run into problems managing the CI server jobs (like you have discovered) and more importantly, you aren't really doing CI. Yes, you are using a CI server, but you aren't continuously integrating the code from all of your developers.

Performing real CI means that all of your developers are committing regularly to the mainline. Easy to say, but the hard part is doing it without breaking your application. I highly recommend you look at Continuous Delivery, especially the Keeping Your Application Releasable section in Chapter 13: Managing Components and Dependencies. The main points are:

  • Hide new functionality until it's finished (a.k.a. Feature Toggles).
  • Make all changes incrementally as a series of small changes, each of which is releasable.
  • Use branch by abstraction to make large-scale changes to the codebase.
  • Use components to decouple parts of your application that change at different rates.

They are pretty self-explanatory, except for branch by abstraction. This is just a fancy term for:

  1. Create an abstraction over the part of the system that you need to change.
  2. Refactor the rest of the system to use the abstraction layer.
  3. Create a new implementation, which is not part of the production code path until complete.
  4. Update your abstraction layer to delegate to your new implementation.
  5. Remove the old implementation.
  6. Remove the abstraction layer if it is no longer appropriate.
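As a rough illustration of the steps above combined with a feature toggle, here is a minimal Python sketch. Everything in it is made up for the example (the `PriceCalculator` abstraction, both calculators, the `new_pricing` toggle name and the discount rule); none of it comes from the book. The point is only that the new implementation lives on the mainline and gets built and tested there, while staying off the production code path until the toggle is switched on.

```python
from __future__ import annotations

from typing import Protocol


class PriceCalculator(Protocol):
    """Step 1: the abstraction that the rest of the system depends on."""

    def total(self, cents: list[int]) -> int: ...


class LegacyCalculator:
    """The existing implementation; still the production code path."""

    def total(self, cents: list[int]) -> int:
        return sum(cents)


class DiscountingCalculator:
    """Step 3: the new implementation (hypothetical rule: 10% off above 100.00)."""

    def total(self, cents: list[int]) -> int:
        subtotal = sum(cents)
        if subtotal > 10_000:
            return subtotal - subtotal // 10
        return subtotal


def make_calculator(toggles: dict[str, bool]) -> PriceCalculator:
    """Step 4: delegate to the new implementation only when the toggle is on."""
    if toggles.get("new_pricing", False):
        return DiscountingCalculator()
    return LegacyCalculator()


def checkout(cents: list[int], toggles: dict[str, bool]) -> int:
    """Step 2: callers go through the abstraction, never a concrete class."""
    return make_calculator(toggles).total(cents)
```

Steps 5 and 6 would then amount to deleting `LegacyCalculator` and the toggle check, and folding the abstraction away if nothing else needs it.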

The following paragraph from the Branches, Streams, and Continuous Integration section in Chapter 14: Advanced Version Control summarises the impacts.

The incremental approach certainly requires more discipline and care - and indeed more creativity - than creating a branch and diving gung-ho into re-architecting and developing new functionality. But it significantly reduces the risk of your changes breaking the application, and will save you and your team a great deal of time merging, fixing breakages, and getting your application into a deployable state.

It takes quite a mind shift to give up feature branches, and you will always get resistance. In my experience this resistance is based on developers not feeling safe committing code to the mainline, and this is a reasonable concern. This in turn usually stems from a lack of knowledge, confidence or experience with the techniques listed above, and possibly from a lack of confidence in your automated tests. The former can be solved with training and developer support. The latter is a far more difficult problem to deal with. However, branching doesn't provide any real extra safety; it just defers the problem until the developers feel confident enough with their code.

ndyer
Tom Howard
  • 4
    Tom, this only works well if 1) both the release and update are comparatively easy 2) most of your changes are well isolated. This is true for web dev, but if you're doing boxed product releases then stable versions have to remain stable at all costs, cause hotfixes are really expensive or even impossible in a large corporate environment. – Jevgeni Kabanov Apr 19 '11 at 06:33
  • 13
    real CI isn't only about integrating, it is also about the feedback – Anton Arhipov Apr 19 '11 at 06:33
  • 3
    I chose this as the answer (at least gave the bounty, please let me know if I somehow still need to mark it correct) but I think this is not a solution for my problem. I've written a followup at http://www.zeroturnaround.com/blog/continuous-integration-and-feature-branches/ – toomasr Apr 20 '11 at 13:10
  • 1
    @Jevgeni Kabanov and @toomasr Both of you seem to assume that doing true CI means relinquishing quality and it works for Web dev only, because it's so easy to push out fixes. I'm guessing that what you are worried about is a dodgy commit just before a release. Yes this can result in a bad release which can be expensive to fix. However a dodgy commit on a feature branch just before it's released is just as bad. If you feel there is a difference please share your reasoning. One way to combat this (if the commit was to the mainline or a feature branch) is to use the Continuous Delivery approach. – Tom Howard Apr 26 '11 at 14:16
  • 1
    Oh, and BTW, for just over the last 4 years my main development experience has been at financial institutions. The imperative to have stable releases and the cost of getting it wrong (not to mention the change control process you need to go through to push out a hotfix) doesn't get much greater than that. A boxed product would be a relaxing change for me. – Tom Howard Apr 26 '11 at 14:28
  • The challenges with boxed products are almost always connected to the fact that they must work in the uncontrolled environment. This is complicated by the fact that we have two products from the same codebase with different release cycles. I'll look into Continuous Delivery, but I'm not too optimistic about that. – Jevgeni Kabanov Apr 27 '11 at 06:58
  • @Jevgeni Kabanov yep, uncontrolled environments are hard to work with, but I don't see how SCM branching helps. Also, you made the call in your initial comment that CI works only when the changes are well isolated. CI is even more important when your changes are not well isolated, as it allows you to deal with changes that other people are making as they happen, not during some big ugly merge down the track, where you have no way to estimate how long or how complex it will be. – Tom Howard Apr 27 '11 at 22:14
  • @toomasr, it looks like you haven't marked it as the correct answer. – Tom Howard Jun 13 '11 at 20:25
  • Marked as correct now, sorry about that. The weird thing is that I've checked the checkbox to get notified via email about further comments, but only today did I stumble upon the last 4 comments. Nothing in my spam folder either. Anyway, case closed now. – toomasr Jun 14 '11 at 06:23
  • @toomasr, cheers. Also, I've read [your article over at zeroturnaround](http://www.zeroturnaround.com/blog/continuous-integration-and-feature-branches/) and I think you might be asking the wrong question. You might want to try "Handling multiple branches in automated builds" instead. I would avoid any mention of CI because it appears that's not what you want to do. – Tom Howard Jun 14 '11 at 08:36
4

I would set up separate jobs for each branch. I've done this before, and it isn't hard to set up and manage if you've configured Hudson/Jenkins correctly. A quick way to create multiple jobs is to copy an existing job that has similar requirements and modify the copy as needed. I'm not sure if you want to allow each developer to set up their own jobs for their own branches, but it isn't much work for one person (i.e. a build manager) to manage. Once the custom branches have been merged into stable branches, the corresponding jobs can be removed when they are no longer necessary.
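To make the bookkeeping concrete, here is a small Python sketch of how a build manager's script might work out which per-branch jobs to create and which to remove. It assumes a naming convention of `<project>-<branch>` and takes the branch list and the existing job names as plain lists; the function names and the convention are invented for this example, and the actual creation or removal (copying a job in the UI, a script, a plugin) is deliberately left out.

```python
from __future__ import annotations


def job_name(project: str, branch: str) -> str:
    """Derive a job name from a branch name (hypothetical '<project>-<branch>' convention)."""
    return f"{project}-{branch.replace('/', '-')}"


def plan_jobs(
    project: str, branches: list[str], existing_jobs: list[str]
) -> tuple[list[str], list[str]]:
    """Return (jobs_to_create, jobs_to_remove) for a single project.

    Only jobs that start with '<project>-' are treated as managed, so jobs
    belonging to other projects are never scheduled for removal.
    """
    prefix = f"{project}-"
    wanted = {job_name(project, branch) for branch in branches}
    managed = {job for job in existing_jobs if job.startswith(prefix)}
    return sorted(wanted - managed), sorted(managed - wanted)


if __name__ == "__main__":
    create, remove = plan_jobs(
        "shop",
        ["dev", "stable", "feature/login"],
        ["shop-dev", "shop-stable", "shop-feature-old", "other-dev"],
    )
    print("create:", create)
    print("remove:", remove)
```

Running something like this after each merge keeps the job list in step with the branches that still exist, which is the cleanup described above.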

If you're worried about the load on the CI server, you could set up separate instances of the CI or even separate slaves to help balance the load across multiple servers. Make sure that the server you are running Hudson/Jenkins on is adequate. I've used Apache Tomcat and just had to ensure that it had enough memory and processing power to process the build queue.

It's important to be clear on what you want to achieve using CI and then figure out a way to implement it without much manual effort or duplication. There's nothing wrong with using other external tools or scripts that are executed by your CI server that help simplify your overall build management process.

Bernard
  • I think this lack of tooling means there is room for some plugins/products in this department. Would not want to write my own. – toomasr Apr 14 '11 at 07:53
  • 1
    There is utility for Jenkins that creates build configuration for each branch automatically: http://entagen.github.com/jenkins-build-per-branch/ – kolen Nov 09 '12 at 13:21
3

I would choose dev+stable branches. And if you still want custom branches and are afraid of the load, then why not move the custom ones to the cloud and let developers manage them themselves, e.g. http://cloudbees.com/dev.cb. This is the company where Kohsuke is now. There is also Eclipse tooling, so if you are on Eclipse, you will have it tightly integrated right into your dev environment.

  • Will I be trading the lack of tooling for managing multiple branches for the same problem in the cloud? I mean, I will be able to manage the load now, but still not the branches? – toomasr Apr 14 '11 at 07:50
  • I meant forget the tooling and distribute the management amongst developers - "if you want a custom personal build, here is your CB account". Without affecting the build performance of the main server. Though their API is pretty simple, so creating management utils would probably be a matter of one or two weeks, and then you can do whatever you want there. As usual in life, if you want something special you're better off doing it yourself. At the same time they are growing fast and listening to the community, so file a feature request and maybe it will appear soon. – Anton Safonov Apr 14 '11 at 10:10
  • Oh, understood. Tell the branch owner to cherry-pick the jobs he is interested in and set them up for his custom branch as he wants. I like this idea. – toomasr Apr 15 '11 at 07:44
1

Actually, what is really problematic is build isolation with feature branches. In our company we have a set of separate Maven projects that are all part of a larger distribution. These projects are maintained by different teams, but for each distribution all projects need to be released. A feature branch may overlap from one project to another, and that's when build isolation gets painful. There are several solutions we've tried:

  • create separate snapshot repositories in nexus for each feature branch
  • share local repositories on dedicated slaves
  • use the repository-server-plugin with upstream repositories
  • build all within one job with one private repository

As a matter of fact, the last solution is the most promising; all the others fall short in one way or another. Together with the job-dsl plugin it is easy to set up a new feature branch: simply copy and paste the Groovy script, adapt the branches, and let the seed job create the new jobs. Make sure that the seed job removes non-managed jobs. Then you can easily scale with feature branches over different Maven projects.

But as Tom said well above, it would be nicer to overcome the need for feature branches and teach devs to integrate cleanly. That is a longer process, though, and the outcome is not clear when there are many legacy parts of the system that you won't touch any more.

my 2 cents

matt
prosailor