
I'm a huge fan of Brownfield Application Development. A great book, no doubt, and I'd recommend it to all devs out there. I'm here because I got to the point in the book about code coverage. At my new shop, we're using TeamCity for automated builds/continuous integration, and it takes about 40 minutes for the build to complete. The Brownfield book talks all about frictionless development and how we want to ease the common burdens that developers have to endure. Here's what I read on page 130:

"Code coverage: Two processes for the price of one? As you can see from the sample target in listing 5.2, you end up with two output files: one with the test results and one with the code coverage results. This is because you actually are executing your tests during this task.

You don’t technically need to execute your tests in a separate task if you’re running the code coverage task. For this reason, many teams will substitute an automated code coverage task for their testing task, essentially performing both actions in the CI process. The CI server will compile the code, test it, and generate code coverage stats on every check-in.

Although there’s nothing conceptually wrong with this approach, be aware of some downsides. First, there’s overhead to generating code coverage statistics. When there are a lot of tests, this overhead could be significant enough to cause friction in the form of a longer-running automated build script. Remember that the main build script should run as fast as possible to encourage team members to run it often. If it takes too long to run, you may find developers looking for workarounds.

For these reasons, we recommend executing the code coverage task separately from the build script’s default task. It should be run at regular intervals, perhaps as a separate scheduled task in your build file that executes biweekly or even monthly, but we don’t feel there’s enough benefit to the metric to warrant the extra overhead of having it execute on every check-in."
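To make sure I'm reading that right: in build-file terms, I take the book to be recommending something like the following. This is a minimal NAnt-style sketch of my own, not our actual setup - every path and switch is illustrative, and the NCover flags in particular vary by version, so check your own tool's docs:

```xml
<project name="ci" default="test">

  <target name="compile">
    <!-- msbuild/csc invocation omitted -->
  </target>

  <!-- fast path: runs on every check-in -->
  <target name="test" depends="compile">
    <exec program="tools\nunit\nunit-console.exe">
      <arg value="build\MyApp.Tests.dll" />
      <arg value="/xml:results\test-results.xml" />
    </exec>
  </target>

  <!-- slow path: deliberately NOT part of the default target; run it on a
       schedule instead. NCover executes the tests itself, which is why the
       book says the coverage task gives you both outputs in one run. -->
  <target name="coverage" depends="compile">
    <exec program="tools\ncover\NCover.Console.exe">
      <!-- //x names the coverage output file in the classic NCover console;
           treat all of these switches as placeholders -->
      <arg line="tools\nunit\nunit-console.exe build\MyApp.Tests.dll //x results\coverage.xml" />
    </exec>
  </target>

</project>
```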

This is contrary to the practice at my current shop, where we execute NCover per build. I want to go to my lead and request we not do this, but the best I can do is tell him "this is what the Brownfield book says". I don't think that's good enough. So I'm relying on you guys to fill me in with your personal experiences and advice on this topic. Thanks.

A-Dubb
  • Why do you want to stop the NCover per build? – mmmmmm Jun 27 '11 at 20:05
  • Why do I want to stop it? I'm not advocating that I want to stop it. That's why I asked the question. Honestly, I'm not sure. I just know it'd save some time. I want to know if it's a valid trade-off to make. – A-Dubb Jun 27 '11 at 20:14
  • How do you know it will save some time? NCover should be a concurrent operation with your testing, adding very little overhead. – Nick Bastin Jun 27 '11 at 20:18
  • @Nick in this case we're BOTH assuming. What really needs to happen here is for me to figure out exactly how long NCover takes. But you say "should" as if you don't know for sure either. The stats still have to be generated. At what level of parallelism that occurs, I dunno. Do you? It's certainly going to be directly proportional to how many tests you have. – A-Dubb Jun 27 '11 at 20:25
  • I'm not really assuming anything - I'm merely asking the questions. I say NCover "should" be a concurrent operation because that's how it is most efficiently set up (and commonly), but I obviously don't have any particular insight into your setup - possibly you're running it serially again after running your tests, which would be incredibly inefficient. – Nick Bastin Jun 28 '11 at 17:04

4 Answers


There are always two competing interests in continuous integration / automated build systems:

  1. You want the build to run as quickly as possible
  2. You want the build to run with as much feedback as possible (e.g., the most tests run, the most information available about the build's stability and coverage, etc.)

You will always need to make tradeoffs and find a balance between these competing interests. I usually try to keep my build times under 10 minutes, and will consider a build system broken if it takes more than about 20 minutes to give any sort of meaningful feedback about the build's stability. But this doesn't need to be a complete build that tests every case; there may be additional tests that run later, or in parallel on other machines, to test the system further.

If you are seeing build times of 40 minutes, I would recommend you do one of the following as soon as possible:

  • Distribute the build/testing onto multiple machines, so that tests can be run in parallel and you can get faster feedback
  • Find things that are taking a lot of time in your build but are not giving a great amount of benefit, and only do those tasks as part of a nightly build

I would 100% recommend the first solution if at all possible. However, sometimes the hardware isn't available right away and we have to make sacrifices.

Code coverage is a stable metric, in that it's rare for your code coverage numbers to get dramatically worse within a single day. So if code coverage is taking a long time to generate, it's not really critical that it happen on every build. But you should still try to get code coverage numbers at least once a night. Nightly builds can be allowed to take a bit longer, since there (presumably) won't be anybody waiting on them, but they still provide regular feedback about your project's status and ensure there aren't lots of unforeseen problems being introduced.
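Concretely (a sketch, not the poster's configuration): keep coverage out of the check-in build entirely, and give the nightly run its own build configuration with a scheduled trigger pointed at a target along these lines:

```xml
<!-- invoked by a TeamCity scheduled trigger (e.g. 2:00 a.m.) rather than on
     check-in; the compile, test, and coverage targets are assumed to be
     defined elsewhere in the build file -->
<target name="nightly" depends="compile, test, coverage" />
```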

That said, if you are able to get the hardware to do more distributed or parallel building/testing, you should definitely go that route - it ensures your developers know as soon as possible if they broke something or introduced a problem into the system. The cost of the hardware will quickly pay for itself in the increased productivity that comes from the build system's rapid feedback.

Also, if your build machine is not constantly working (i.e. there is a lot of time when it is idle), then I would recommend setting it up to do the following:

  • When there is a code change, do a build and test. Leave out some of the longer-running tasks, including potentially code coverage.
  • Once this build/test cycle completes (or in parallel with it), kick off a longer build that tests things more thoroughly, does code coverage, etc.
  • Both of these builds should give feedback about the health of the system

That way, you get the quick feedback, but also get the more extended tests for every build, so long as the build machine has the capacity for it.
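Sketched as NAnt targets (the names are illustrative; in TeamCity this would be two build configurations, the first triggered on every check-in and the second triggered by the first's success or pointed at spare agent time):

```xml
<project name="two-stage" default="quick">

  <target name="compile"><!-- msbuild invocation omitted --></target>
  <target name="unit-test" depends="compile"><!-- nunit-console run --></target>
  <target name="slow-test" depends="compile"><!-- integration/slow suites --></target>
  <target name="coverage" depends="compile"><!-- NCover-wrapped test run --></target>

  <!-- stage 1: per check-in; fast feedback, long-running tasks left out -->
  <target name="quick" depends="unit-test" />

  <!-- stage 2: fired after (or in parallel with) stage 1; thorough tests
       plus coverage, and it's allowed to take longer -->
  <target name="extended" depends="slow-test, coverage" />

</project>
```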

Laepdjek

I wouldn't make any presumptions about how to fix this - you're putting the cart before the horse a bit here. You have a complaint that the build takes too long, so that's the issue I would ask to resolve, without preconceived notions about how to do it. There are many other potential solutions to this problem (faster machines, different processes, etc.) and you would be wise not to exclude any of them.

Ultimately this is a question of whether your management values the output of the build system enough to justify the time it takes. (And whether any action you might take to remedy the time consumption has acceptable fidelity in output).

Nick Bastin
  • This isn't a complaint. But I know for sure that one less task is going to cut time off the build. It's a logical thought. If I have 5 methods being executed for a particular algorithm, and I remove one of them, I'm guaranteed to cut down on time. Even if it's one nanosecond. The reason I single out code coverage is because it's kind of one of those things you can manage to go without for a few builds. Unit tests on the other hand should be executed EVERY build. – A-Dubb Jun 27 '11 at 20:12
  • @A-Dubb: Why do you value unit tests more than coverage? If you think unit tests are absolutely essential, wouldn't you also think it's essential to verify that the unit tests you're running actually cover all your code? E.g., why is it acceptable to potentially run a set of unit tests that aren't effective? It also seems very odd that NCover massively extends the time if you're already running your unit tests - this should be a single operation. – Nick Bastin Jun 27 '11 at 20:14
  • Also, maybe you know how much time it takes, but you certainly haven't established it here. Sure, cutting 1 method out of an algorithm will save you time, but if 99% of the time is spent in the other 4, your time would probably be better spent investigating other solutions. – Nick Bastin Jun 27 '11 at 20:16
  • Yes, that's true. I'm not sure exactly how long NCover takes to execute, but I'm willing to bet it takes enough time that running it less often would relieve a fair amount of pain. I'll look into the exact numbers though. – A-Dubb Jun 27 '11 at 20:19

This is a per-team and per-environment decision. First determine your threshold for build duration; once you have it, factor longer-running processes out into less-frequent occurrences (ideally still no fewer than once or twice a day in CI).

Ethan Cabiac
  • So you're implying that it shouldn't be executed for each and every build, but rather consolidated into twice a day at most? – A-Dubb Jun 27 '11 at 20:16
  • Umm...no. Once or twice **at least**, and **only** if it exceeds your predetermined threshold for build length. – Ethan Cabiac Jun 27 '11 at 20:18

The objection appears to be that executing all the tests, and collecting code coverage, is expensive, and you don't (well, someone doesn't) want to pay that price for each build.

I cannot imagine why on earth you (or that someone) would not want to always know what the coverage status was.

If the build machine has nothing else to do, then it doesn't matter if it does this too. If your build machine is too busy doing builds, maybe you've overloaded it by asking it to serve too many masters, or you are doing too many builds (why so many changes? hmm, maybe the tests aren't very good!).

If the problem is that the tests themselves really do take a long time, you can perhaps find a way to optimize the tests. In particular, you shouldn't need to re-run tests for the part of the code that didn't change. Figuring out how to do this (and trusting it) might be a challenge.

Some test coverage tools (such as ours) enable you to track which tests cover which parts of the code and, given a code change, to determine which tests need to be re-run. With some additional scripting, you can re-run the affected tests first; this gets you what amounts to full test results early/fast, without running all the tests. Then if there are issues with the build, you find out as soon as possible.

[If you are paranoid and don't really trust the incremental testing process, you can run them for the early feedback, and then go on to run all the tests again, giving you full results.]
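The selection step depends entirely on the tooling, but the build-file side can stay simple. A hedged sketch: assume some impact-analysis step (hypothetical here, along with the property name) has written the list of affected test assemblies into ${affected.tests}, and the build just runs those first:

```xml
<!-- ${affected.tests} is assumed to be filled in by whatever coverage-based
     impact-analysis tool you use; the property and the tool's output format
     are hypothetical -->
<target name="test-affected">
  <exec program="tools\nunit\nunit-console.exe">
    <arg line="${affected.tests}" />
    <arg value="/xml:results\affected-results.xml" />
  </exec>
</target>

<!-- the paranoid variant from above: affected tests first for early
     feedback, then the full suite for complete results -->
<target name="test-full" depends="test-affected">
  <exec program="tools\nunit\nunit-console.exe">
    <arg value="build\MyApp.Tests.dll" />
    <arg value="/xml:results\full-results.xml" />
  </exec>
</target>
```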

Ira Baxter
  • Fair enough. There seems to be a lingering sentiment that code coverage is simply too important to ignore on a per-build basis. I don't have a particular stance on the matter; this is more of an open-ended question to get an idea of the most common preference. – A-Dubb Jun 28 '11 at 16:55