I'm trying to understand the output of the gcov tool. Running it with no options makes sense, but I'd like to understand the branch coverage options. Unfortunately, it's hard to make sense of what the branches do and why they aren't taken. Below is the output for a method (compiled using the latest LLVM/Clang build).

function -[TestCoverageAppDelegate loopThroughArray:] called 5 returned 100% blocks executed 88%
        5:   30:- (NSInteger)loopThroughArray:(NSArray *)array {
        5:   31:    NSInteger i = 0;
       22:   32:    for (NSString *string in array) {
branch  0 taken 0
branch  1 taken 7
        -:   33:        
       22:   34:    }
branch  0 taken 4
branch  1 taken 3
branch  2 taken 0
branch  3 taken 3
        5:   35:    return i;
        -:   36:}

I've run 5 tests through this, passing in nil, an empty array, an array with 1 object, an array with 2 objects and an array with 4 objects. I can guess that in the first case, branch 1 means "go into the loop", but I haven't a clue what branch 0 is. In the second case, branch 0 seems to be "loop through again", branch 1 seems to be "end the loop" and branch 3 is "continue/exit the loop", but I have no idea what branch 2 is or why/when it would be executed.

If anyone knows how to decipher the branch info, or knows of any detailed documentation on what it all means, I'd appreciate the help.

Martin Pilkington
  • Try to get an assembly of your function and check number of `j**` instructions in it. – osgx Aug 14 '11 at 22:03
  • My assembly is not very good but it seems there are 4. The first is a je which I believe skips over the loop if there are no objects to enumerate. Then another je which skips over an enumeration mutation exception, a jb which I have no idea about but moves back to the top of the loop and then a jne which I think moves to the top of the loop if there are objects left to enumerate. Interestingly, mutating causes the first branch 0 to take, which solves one mystery, but branch 2 still eludes me – Martin Pilkington Aug 14 '11 at 22:28
  • Hmm.. another option is to disassemble the object file that was compiled with `-fprofile-arcs -ftest-coverage` (the instrumentation gcov needs). You should see calls to some gcov instrumenting functions in that disassembly, something like a "__llvm_gcov_ctr" increment or a "__llvm_gcda_edge" call. Also, did you compile at `O0` with `-fno-inline`? – osgx Aug 15 '11 at 00:13
  • I've been compiling at O0. I didn't use -fno-inline but I've just tried and it seems to have no effect. There are various __llvm_gcov_ctr101 comments in the disassembly but I have little to no idea what the rest of it means. I suspect it's some intricacy of fast enumeration in Cocoa. Or it could be a bug in LLVM/Clang, given how relatively new the feature is, though I'm more likely to suspect the former in this case – Martin Pilkington Aug 15 '11 at 00:51
  • The __llvm_gcov_ctr101 in those comments is a counter whose value gcov will show. There should be several different counters in the asm text of the function. This is not a bug: this ObjC construct becomes a lot of basic blocks (en.wikipedia.org/wiki/Basic_block) of asm, and gcov counts executions of those basic blocks. – osgx Aug 15 '11 at 09:19
  • Aha, that actually helped a lot, now that I know what all the ## BB and LBB stuff in the generated assembly for my source means. I can match it all up and see that branch 2 is the jne. Still not 100% sure what it does as I'll have to brush up on assembly but I know where it is in the source now. Anyway I know how to understand what the gcov means now, which is what I wanted. If you want to post an answer I'll select it, thanks for your help! :) – Martin Pilkington Aug 15 '11 at 11:19

1 Answer

Gcov works by instrumenting, at compile time, every basic block of machine instructions (think of the assembly). A basic block is a linear section of code with no branches inside it and no labels inside it, so once execution enters a basic block it always reaches the end of that block. Basic blocks are organized into a CFG (control flow graph, a directed graph) that shows the relations between basic blocks: an edge from V1 to V2 means control can pass directly from V1 to V2. The compiler's profile-arcs mode and gcov want an execution count for every line, and they get it by counting basic block executions. Only some edges of the CFG are instrumented; the counts for the rest can be computed from the instrumented ones, because the counts of the edges entering and leaving a block must balance.
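As a rough illustration of that last point, here is a minimal C sketch of a simple loop written as explicit basic blocks, with counters on only two of its four CFG edges. The names (`edge_counts`, `sum_with_counters`, `derive_edges`) are made up for this example and are not what the compiler actually emits; the point is only that the uncounted edges can be recovered by balancing flow in and out of each block.

```c
#include <stddef.h>

/* Hypothetical edge counters for a simple loop (illustration only).
   CFG edges: entry -> header, header -> body, header -> exit,
   and the back edge body -> header. */
struct edge_counts {
    unsigned long entry_to_header; /* instrumented */
    unsigned long body_to_header;  /* instrumented */
    unsigned long header_to_body;  /* derived afterwards */
    unsigned long header_to_exit;  /* derived afterwards */
};

/* Sum an array while counting only two of the four CFG edges. */
long sum_with_counters(const long *values, size_t n, struct edge_counts *c)
{
    long total = 0;
    size_t i = 0;

    c->entry_to_header++;      /* edge: entry -> header */
header:
    if (i >= n)
        goto done;             /* edge: header -> exit (not counted) */
    /* edge: header -> body (not counted) */
    total += values[i];
    i++;
    c->body_to_header++;       /* edge: body -> header */
    goto header;
done:
    return total;
}

/* Recover the uncounted edges from flow conservation. */
void derive_edges(struct edge_counts *c)
{
    /* The body has one edge in and one edge out, so both carry the same count. */
    c->header_to_body = c->body_to_header;
    /* Everything that enters the header must leave it:
       entry + back edge = (to body) + (to exit). */
    c->header_to_exit = c->entry_to_header + c->body_to_header
                      - c->header_to_body;
}
```

Calling `sum_with_counters` a few times and then `derive_edges` once fills in all four edge counts, even though the loop itself only ever incremented two of them.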

Your ObjC construct (for..in) is lowered (converted early in compilation) into several basic blocks. So gcov sees 4 branches, because it sees only the lowered basic blocks (BBs). It knows nothing about this lowering, but it does know which source line corresponds to every assembler instruction (from the debug info). The branches, then, are edges of the CFG.
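To make the lowering more concrete, here is a C sketch of roughly what a for..in loop might turn into, based on the four jumps described in the comments on the question (skip the loop if there is nothing to enumerate, a mutation check, an inner loop test, and a test for whether more objects are left). Everything here is a stand-in: `struct collection`, `fetch_batch` and the batch size of 16 are assumptions for illustration. The real code calls `countByEnumeratingWithState:objects:count:` and raises via `objc_enumerationMutation` instead of returning.

```c
#include <stddef.h>

#define BATCH 16 /* assumed batch size for this sketch */

struct collection {
    const int *items;
    size_t len;
    unsigned long mutations; /* bumped whenever the collection changes */
};

struct enum_state {
    size_t cursor;    /* next item to hand out */
    const int *batch; /* start of the current batch */
};

/* One counter per outcome of each conditional jump. */
struct branch_counts {
    unsigned long skip_loop, enter_loop;
    unsigned long mutated, not_mutated;
    unsigned long next_item, batch_done;
    unsigned long next_batch, finished;
};

/* Hypothetical stand-in for fetching the next batch of objects. */
static size_t fetch_batch(const struct collection *c, struct enum_state *s)
{
    if (c == NULL || s->cursor >= c->len)
        return 0; /* nil or exhausted: nothing to enumerate */
    size_t remaining = c->len - s->cursor;
    size_t n = remaining < BATCH ? remaining : BATCH;
    s->batch = c->items + s->cursor;
    s->cursor += n;
    return n;
}

/* Returns how many items were visited and tallies every branch outcome. */
size_t loop_through(const struct collection *c, struct branch_counts *b)
{
    struct enum_state s = {0, NULL};
    size_t visited = 0;
    size_t count = fetch_batch(c, &s);

    if (count == 0) { b->skip_loop++; return visited; }
    b->enter_loop++;
    unsigned long start = c->mutations;
    do {
        size_t i = 0;
        do {
            if (c->mutations != start) {
                b->mutated++;   /* the real code raises an exception here */
                return visited;
            }
            b->not_mutated++;
            const int item = s.batch[i]; /* the (empty) loop body goes here */
            (void)item;
            visited++;
            i++;
            if (i < count) b->next_item++; else b->batch_done++;
        } while (i < count);
        count = fetch_batch(c, &s);
        if (count != 0) b->next_batch++; else b->finished++;
    } while (count != 0);
    return visited;
}
```

Run over the five inputs from the question (nil, empty, 1, 2 and 4 objects), this sketch tallies 0 and 7 for the mutation check, then 4, 3, 0 and 3 for the two loop tests, which lines up with the counts in the listing. If this reading is right, branch 2 is "fetch another batch", which would only be taken for a collection larger than one batch.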

If you want to see the basic blocks, you can dump the assembly of the compiled program, disassemble the binary, or dump the CFG from the compiler. You can do this in both profile-arcs and non-profile-arcs modes and compare the results.

In profile-arcs mode there will be a lot of calls to, and increments of, something like "__llvm_gcov_ctr" or "__llvm_gcda_edge"; that is the actual instrumentation of the basic blocks.

osgx