1

I am trying to profile a C program that uses some methods of openssl/libcrypto. Everything work well when I compile and run the code without profiling information. When I add options to profile it with gprof, I get unexpected results from the profiling tool.

I did many researched but I didn't find any page that solved my problem.

This is my code (named test.c):

#include <stdio.h>
#include <openssl/bn.h>
#include <openssl/rand.h>

static BIGNUM *x;
static BIGNUM *y;
static BIGNUM *z;
static BIGNUM *p;
static BN_CTX *tmp;
static unsigned int max_size;

int main(void){

  int max_bytes, res_gen;
  max_bytes  = 50;

  tmp = BN_CTX_new();
  BN_CTX_init(tmp);
  x = BN_new();
  y = BN_new();
  z = BN_new();
  p = BN_new();

  RAND_load_file("/dev/urandom", max_bytes);

 max_size = 256; 
 BN_rand(x, max_size, 0, 0);
 BN_rand(y, max_size, 0, 0);
 res_gen = BN_generate_prime_ex(p, max_size, 0, NULL, NULL, NULL);

 BN_mul(z, x, y, tmp);
 BN_nnmod(x, z, p, tmp);

 printf("\nOk\n");

 BN_free(x);
 BN_free(y);
 BN_free(z);
 BN_free(p);
 BN_CTX_free(tmp); 

 return 0;
}

When I compile with profiling information using gcc -pg -static test.c -lcrypto -ldl, it produces the following results. I get 0% (and 0 second) for everything, which is unexpected.

Flat profile:

Each sample counts as 0.01 seconds.
no time accumulated

%   cumulative   self              self     total           
time   seconds   seconds    calls  Ts/call  Ts/call  name    
0.00      0.00     0.00        1     0.00     0.00  main


                         Call graph

granularity: each sample hit covers 2 byte(s) no time propagated

index % time    self  children    called     name
                0.00    0.00       1/1           __libc_start_main [4282]
[1]      0.0    0.00    0.00       1         main [1]
                0.00    0.00       0/0           mcount (3495)
                0.00    0.00       0/0           BN_CTX_new [275]
                0.00    0.00       0/0           BN_CTX_init [274]
                0.00    0.00       0/0           BN_new [372]
                0.00    0.00       0/0           RAND_load_file [1636]
                0.00    0.00       0/0           BN_rand [386]
                0.00    0.00       0/0           BN_generate_prime_ex [331]
                0.00    0.00       0/0           BN_mul [370]
                0.00    0.00       0/0           BN_nnmod [378]
                0.00    0.00       0/0           puts [3696]
                0.00    0.00       0/0           BN_free [327]
                0.00    0.00       0/0           BN_CTX_free [272]
-----------------------------------------------

Also, it seems that the profiler detects only the main method because details for others methods don't appear in flat profile and call graph.

So, I would like to know if I must compile OpenSSL library with some options (what options ?) or something else.

jww
  • 97,681
  • 90
  • 411
  • 885
Yssouf
  • 11
  • 3
  • @jww: Thanks for the modifications. I didn't know how to add this image directly. The first command error is what leads me to second. Also, I added a second image for the second command results. – Yssouf Apr 06 '16 at 03:07
  • @jww: text added (I hope in the right manner) – Yssouf Apr 06 '16 at 10:06
  • Yes, it looks fine, thanks. And it tests well too: [Google search BN_CTX_new gprof](https://www.google.com/search?q=BN_CTX_new+grpof). Your question is in the top result, which means its indexed for future visitors. – jww Apr 06 '16 at 16:22
  • Something else you can do if you want to know how many times a function is called is to use `-coverage` option. It does not show CPU time, but it does show the count of each function called. I use it for other projects to ensure code has test cases that exercise a particular code path. It might show you a red herring in your code. For example, `BN_new` and `BN_exp` have different time costs (allocation is less expensive than exponentiation), so you will need to have an idea before hand (which appears to be what you are trying to discover). – jww Apr 06 '16 at 16:32
  • Also see [Weird profiling output using gprof](http://stackoverflow.com/q/21563303), [How to compare several Gprof profiler reports](http://stackoverflow.com/q/21204407) and [Why does gprof significantly underestimate the program's running time?](http://stackoverflow.com/q/4309831) – jww Apr 06 '16 at 16:41

1 Answers1

0

gprof is a CPU-profiler. That means it cannot account for any time spent in blocking system calls like I/O, sleep, locks, etc.

Assuming the goal is to find speedups, what a lot of people do is get it running under a debugger like GDB and take stack samples manually.

It's not measuring (with any precision). It's pinpointing problems. Anything you see it doing that you could avoid, if you see it on more than one sample, will give you a substantial speedup. The fewer samples you have to take before seeing it twice, the bigger the speedup.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135