19

This Sonar page lists the methods employed by different code coverage analysis tools:

  1. Source code instrumentation (used by Clover)
  2. Offline byte code instrumentation (used by Cobertura)
  3. On-the-fly byte code instrumentation (used by JaCoCo)

What are these three methods, which one is the most efficient, and why? If the answer to the question of efficiency is "it depends", then please explain why.

Geek

2 Answers

16

Source code instrumentation consists of adding instructions to the source code before compiling it. These instructions are used to trace which parts of the code have been executed.
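As a rough illustration of that idea, here is a minimal hand-written sketch of what a source-level instrumenter might emit. The class name, the `HITS` array and the probe numbering are all hypothetical; this is not the code any particular tool generates.

```java
import java.util.Arrays;

// Hypothetical illustration only: names and probe layout are made up.
class SourceInstrumentationSketch {
    // One flag per probe; a real tool would also persist these at exit.
    static final boolean[] HITS = new boolean[3];

    // The method as the developer wrote it.
    static int absOriginal(int x) {
        if (x < 0) {
            return -x;
        }
        return x;
    }

    // The same method after probes were added to the source, before compilation.
    static int absInstrumented(int x) {
        HITS[0] = true;        // method entered
        if (x < 0) {
            HITS[1] = true;    // "then" block executed
            return -x;
        }
        HITS[2] = true;        // fall-through path executed
        return x;
    }

    public static void main(String[] args) {
        absInstrumented(5);
        // Probe 1 stays false because the negative branch never ran.
        System.out.println(Arrays.toString(HITS));
    }
}
```

The instrumented method behaves exactly like the original; the only difference is the extra stores that record which blocks ran.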

Offline byte-code instrumentation consists of adding those same instructions, but after compilation, directly into the byte-code.

On-the-fly byte-code instrumentation consists of adding those same instructions to the byte-code, but dynamically, at runtime, when the byte-code is loaded by the JVM.
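The usual hook for the on-the-fly variant is the standard `java.lang.instrument` API. The sketch below only shows the shape of such an agent, under the assumption that a real tool would rewrite the class bytes (typically with a bytecode library) where this one merely records the class name. `CoverageAgentSketch`, `LoggingTransformer` and `SEEN` are hypothetical names.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical agent skeleton: it observes class loading but changes nothing.
class CoverageAgentSketch {
    // Internal names (e.g. "com/example/Foo") of classes offered to the transformer.
    static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    static final class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null) {
                SEEN.add(className);
            }
            // A coverage tool would return rewritten bytes containing probes here.
            // Returning null tells the JVM to keep the class unchanged.
            return null;
        }
    }

    // Called by the JVM before main() when the jar is passed via -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }

    public static void main(String[] args) {
        // Simulate the JVM callback by hand, just to show the contract.
        byte[] result = new LoggingTransformer()
                .transform(null, "com/example/Foo", null, null, new byte[0]);
        System.out.println(SEEN + " modified=" + (result != null));
    }
}
```

To run as a real agent, the class would be packaged in a jar whose manifest names it in `Premain-Class`, and the JVM would be started with `-javaagent:` pointing at that jar. Offline instrumentation performs the same kind of rewrite, but as a build step on the `.class` files rather than at load time.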

This page has a comparison between the methods. It might be biased, since it's part of the Clover documentation.

Depending on your definition of "efficient", choose the one you like the most. I don't think you'll get enormous differences. They all do the job, and the big picture will be the same whatever the method used.

JB Nizet
3

In general the effect on coverage is the same.

Source code instrumentation can give superior reporting results, simply because byte-code instrumentation cannot distinguish any structure within source lines, as the code block granularity is only recorded in terms of source lines.

Imagine I have two nested if statements (or equivalently, if (a && b) ...) on a single line. A source code instrumenter can see these and provide coverage information for each arm of the condition within the source line; it can report blocks based on lines and columns. A byte code instrumenter sees only one line wrapped around the conditions. Does it report the line as "covered" if condition a executes, but is false?
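To make the question concrete, here is a small sketch of per-operand probes on such a line. The `probe` helper and the slot layout are hypothetical and only illustrate the idea, not how any specific instrumenter does it.

```java
import java.util.Arrays;

// Hypothetical illustration of sub-line (per-operand) coverage probes.
class ConditionCoverageSketch {
    // Slots: 0 = a was true, 1 = a was false, 2 = b was true, 3 = b was false.
    static final boolean[] HITS = new boolean[4];

    // Records the outcome of one operand and passes its value through unchanged.
    static boolean probe(boolean value, int trueSlot) {
        HITS[value ? trueSlot : trueSlot + 1] = true;
        return value;
    }

    // Instrumented form of: if (a && b) return "both"; return "not both";
    static String check(boolean a, boolean b) {
        if (probe(a, 0) && probe(b, 2)) { return "both"; }
        return "not both";
    }

    public static void main(String[] args) {
        check(false, true);
        // Only slot 1 is set: the line ran, but b was never evaluated.
        System.out.println(Arrays.toString(HITS));
    }
}
```

A report that works purely at line granularity would mark this line as executed, even though the short-circuit means `b` was never tested.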

You may argue this is a rare circumstance (and it probably is), and is therefore not very useful. When you get bogus coverage on it followed by a field failure, you may change your mind about utility.

There's a nice example and explanation of how byte code coverage makes it extremely difficult to get coverage of switch statements right.

A source code instrumenter may also achieve faster test executions, because it has the compiler helping optimize the instrumented code. In particular, a probe inserted inside a loop by a binary instrumenter may get compiled inside the loop by a JIT compiler. A good Java compiler will see that the instrumentation produces a loop-invariant result, and lift the instrumentation out of the loop. (A JIT compiler can arguably do this too; the question is whether it actually does so.)
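The transformation being described can be sketched by hand as follows. Both methods are written manually here to show the before and after; whether a given compiler or JIT actually performs this hoisting is not something the sketch demonstrates. All names are hypothetical.

```java
// Hypothetical illustration of lifting a loop-invariant probe out of a loop.
class ProbeHoistingSketch {
    static final boolean[] HITS = new boolean[1];

    // Probe left inside the loop body: the same store runs on every iteration.
    static long sumProbeInLoop(int[] data) {
        long sum = 0;
        for (int v : data) {
            HITS[0] = true;
            sum += v;
        }
        return sum;
    }

    // Equivalent form with the probe hoisted: one store if the body runs at all.
    static long sumProbeHoisted(int[] data) {
        long sum = 0;
        if (data.length > 0) {
            HITS[0] = true;
        }
        for (int v : data) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sumProbeInLoop(data) == sumProbeHoisted(data));
    }
}
```

The guard on `data.length` matters: without it, the hoisted version would report the loop body as covered even when the loop never executes.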

Ira Baxter
  • Actually, the reason that tools like Cobertura and JaCoCo don't show intra-line coverage information is simply that the developers chose not to implement it. In my own on-the-fly bytecode instrumentation tool (JMockit Coverage), this *is* implemented and separate line segments (such as in "`if (a && b)`") *are* shown as such in the coverage report. – Rogério Sep 02 '14 at 18:03
  • That's the way all tools are: "the developers chose not to implement (some feature)". Good that you have more enthusiasm. How do you distinguish the parts of the line? I didn't think the class files gave information at any finer grain of detail. – Ira Baxter Sep 02 '14 at 18:52
  • At the bytecode level, jump instructions are the basis for separating multiple executable segments in a line of code. Each jump instruction has a target instruction, which may or may not be in the same line; this information is made available by the ASM library which is used for bytecode manipulation. For each line of code, a list of the jump instructions and their targets is kept, and made available to the HTML report generator at a later time, which then parses each *source* line of code while matching bytecode branches to individual line segments. – Rogério Sep 02 '14 at 20:12
  • @Rogério: That must have been fun to implement. How can that work in the face of code optimizations/simplifications and code motion? (Another user reported that the condition test for "x==null" is left out by the compiler, if it knows that "x!=null" is true when it evaluates the condition "x==null". How do you match things up reliably?) – Ira Baxter May 02 '15 at 03:44
  • I didn't run into any such difficulties simply because there are no optimizations/simplifications made at the bytecode level; Java compilers generate standardized and unoptimized bytecode (with a few minor differences between javac and the Eclipse compiler), and that (plus the source code) is all the coverage tool needs to work with. JIT optimizations don't interfere. I think the "other user" is mistaken; if there is an "x==null" condition in the source, it will always be in the bytecode. – Rogério May 04 '15 at 15:21
  • But doesn't using instrumentation mean that we have one build which includes the instrumentation printing and another build for operational use (without printing)? So someone can always say that the tests were done on a different build, right? – ransh Aug 01 '17 at 19:52
  • @ransh: Yes, they can *say* that. Well designed instrumentation does not affect the functionality of the program, unless there is something explicit in the program to check for presence of that instrumentation to cause some behavioral change, and one can be reasonably certain in a cooperative engineering process that such checks do not exist except in bizarre ways (e.g., inspecting the program size, which will change due to added code volume from instrumentation). ... – Ira Baxter Aug 03 '17 at 00:13
  • @ransh: ... The instrumentation also affects performance (for our tools in Java, by adding about 10-15% overhead) and that may also affect functionality. Most applications have a lot of performance headroom so this tends not to be a big problem in practice. OK, so one can *say* that one tested something other than the uninstrumented build, but the practical effect is as if you had collected test coverage data on the production build. If you insist, and you don't mind the overhead, you can in fact deploy the instrumented code in production :-) – Ira Baxter Aug 03 '17 at 00:17