
Edit: I've figured it out. The 2nd algorithm was running so efficiently that the time wasn't even being recorded with an input < 100,000.

I'm trying to measure how long a certain algorithm that I have implemented in a function takes to execute. I've included <time.h> and I'm surrounding the function call with clock_t variables. It works great for my first implementation, but not my second.

Do I need to close the clock stream (can't think of a better word) between uses? Kind of like how you close a Scanner stream in a Java program. Here's my code in case I'm not explaining it well.

switch(choice) {
    case 1:
        printf("Beginning prefixAverages1\n");
        clock_t begin1 = clock();
        int *a1 = prefixAverages1(input);
        clock_t end1 = clock();
        double time_spent1 = (double)(end1 - begin1) * 1000.0 / CLOCKS_PER_SEC;
        free(a1);
        printf("Algorithm took %f milliseconds to execute \n", time_spent1);
        break;
    case 2:
        printf("Beginning prefixAverages2\n");
        clock_t begin2 = clock();
        int *a2 = prefixAverages2(input);
        clock_t end2 = clock();
        double time_spent2 = (double)(end2 - begin2) * 1000.0 / CLOCKS_PER_SEC;
        free(a2);
        printf("Algorithm took %f milliseconds to execute \n", time_spent2);
        break;
    default:
        printf("Invalid input!");
        break;
}

Time is displayed properly in my 1st case, but not in the second. I've tried doing some research but I can't find anything particular to my scenario.

When running case 1, depending on the input, I get a time between 600-1000 ms (which sounds about right). When I run case 2, regardless of the input, I get 0.000.

Here are my functions if that helps:

int* prefixAverages1(int input) {
    int x[input];
    int *a = malloc(input*sizeof(*a));
    srand(time(NULL));  

    for(int i = 0; i < input; i++) {
        int sum = 0;
        for(int j = 0; j < i; j++) {
            int r = rand() % 100;
            x[j] = r;
            sum = sum + x[j];
        }
        a[i] = sum / (i+1);
    }
    return a;
}

int* prefixAverages2(int input) {
    int sum = 0;
    int x[input];
    int *a = malloc(input*sizeof(*a));
    srand(time(NULL));  

    for(int i = 0; i < input; i++) {
        int r = rand() % 100;
        x[i] = r;
        sum = sum + x[i];
        a[i] = sum / (i+1);
    }
    return a;
}
  • How do you know the time is improper in the second case? What do you get from the first and the second? How long do the functions actually take to run? If they're in the sub-50 ms range, the timing is likely to be unreliable, IMO; longer than that would be better. But you really need to show what you're getting. – Jonathan Leffler May 23 '18 at 00:18
  • @JonathanLeffler You're right. Sorry about that. I've edited the post. – InfoSecNick May 23 '18 at 00:21
  • Looks like the measurements should be accurate. Have you tried executing the 2nd choice and THEN the first? https://en.wikipedia.org/wiki/CPU_cache#CPU_stalls. Although 600ms worth of memory time is far more than a modern CPU can cache as far as I know (generally speaking) – zzxyz May 23 '18 at 00:32
  • Add a sleep between `begin2` and `end2` - that will tell you if timing is a problem or the 2nd algorithm is really fast... – John3136 May 23 '18 at 00:34
  • @zzxyz Yes, I've tried running the 2nd choice first. Hmm, it's strange I'm not even getting 0.001 ms. – InfoSecNick May 23 '18 at 00:40
  • Is it possible it's been completely optimized away? – Schwern May 23 '18 at 00:41
  • Print the values of `begin` and `end`. Check whether your second function returns instantly. – 0___________ May 23 '18 at 00:42
  • @John3136 I've added a sleep statement, and I'm still getting 0.000 ms – InfoSecNick May 23 '18 at 00:43
  • @Schwern Do you mean optimized to the point where it runs instantly? Maybe. But if I put a very large integer, it should still take a millisecond to run at least. – InfoSecNick May 23 '18 at 00:45
  • @InfoSecNick I mean optimized to the point where the compiler eliminates the call to `prefixAverages2` completely. Since the code never uses the return value (except to free it), if `prefixAverages2` has no side effects the compiler might eliminate the call completely. Try compiling with and without optimizations. Or do something with the contents of `a2`, like printing them. Maybe you should show us `prefixAverages2`. – Schwern May 23 '18 at 00:48
  • @Schwern That's interesting. Didn't know that was something a compiler could do. All of the work for `a2` is done within the function call. It creates an array of size n, and averages the values for the array. The first function does the same, just with a different algorithm. I can print the values, but I'm looking to measure the execution-time differences between the two algorithms. – InfoSecNick May 23 '18 at 00:54
  • @InfoSecNick Assuming your sleep was big enough (make it 1000) then one possibility would be stack corruption in the 2nd algorithm. – John3136 May 23 '18 at 00:55
  • @Schwern I've added my functions to my post. – InfoSecNick May 23 '18 at 00:56
  • What value of `input` are you using? – user3386109 May 23 '18 at 01:02
  • @InfoSecNick I can't speak for the sleep(), but the 2nd function is obviously going to be thousands of times faster than the first once your input values get into the > 2000 range. – zzxyz May 23 '18 at 01:03
  • @user3386109 It is user input. I'm trying values between 10 and 50,000. Same answer every time. – InfoSecNick May 23 '18 at 01:07
  • @zzxyz Yes, that's a given. I was hoping I'd be able to get at least a value to compare with. – InfoSecNick May 23 '18 at 01:08
  • Try printing some of the values returned outside the timing code and see whether that makes enough difference (see the sketch after these comments). The first function appears to be O(N*N) and the second O(N), so you'd expect the second to be quicker, a lot quicker for big enough values of N. – Jonathan Leffler May 23 '18 at 02:24
  • @InfoSecNick Thanks. I'm still getting valid times. My best guess is that `clock` failed for some reason. [See the update in my answer](https://stackoverflow.com/a/50478277/14660) for how to check for that. The other possibility is that `begin` and `end` are so close together that `(double)(end - begin) * 1000.0 / CLOCKS_PER_SEC;` is resulting in 0 because of floating point inaccuracy. Try printing `begin` and `end` alone. – Schwern May 23 '18 at 05:50
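
A quick way to test both suggestions from the comments above (dead-code elimination and a failed or too-coarse `clock()`) is to consume the returned array and print the raw tick values. A minimal sketch; the `checksum` variable is illustrative, not part of the original code:

clock_t begin2 = clock();
int *a2 = prefixAverages2(input);
clock_t end2 = clock();

/* Consume the result so the compiler cannot discard the call. */
long checksum = 0;
for(int i = 0; i < input; i++)
    checksum += a2[i];
free(a2);
printf("checksum: %ld\n", checksum);

/* Raw tick values show whether clock() advanced or failed (-1). */
printf("begin2 = %ld, end2 = %ld, CLOCKS_PER_SEC = %ld\n",
       (long)begin2, (long)end2, (long)CLOCKS_PER_SEC);

If the checksum prints but `begin2` and `end2` are equal, the call ran and simply finished within a single clock tick.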

2 Answers


While I don't know why the second case reports 0, the two functions have the same signature, so you can eliminate the duplicated timing code with a function pointer.

void do_benchmark( const char *name, int*(*functionPtr)(int), int input ) {
    printf("Beginning %s\n", name);
    clock_t begin = clock();
    int *ret = (*functionPtr)(input);
    clock_t end = clock();
    double time_spent = (double)(end - begin) * 1000.0 / CLOCKS_PER_SEC;
    free(ret);
    printf("Algorithm took %f milliseconds to execute \n", time_spent);
}

Then both functions are run with the same timing code, eliminating differences in the benchmarking code as the culprit.

switch(choice) {
    case 1:
        do_benchmark("prefixAverages1", &prefixAverages1, input);
        break;
    case 2:
        do_benchmark("prefixAverages2", &prefixAverages2, input);
        break;
    default:
        printf("Invalid input!");
        break;
}
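
If a single call completes faster than the clock can resolve, a common workaround (a sketch of mine, not part of the original answer; `do_benchmark_reps` and `reps` are names I've made up) is to run the function in a loop and report the average per call:

void do_benchmark_reps( const char *name, int*(*functionPtr)(int), int input, int reps ) {
    printf("Beginning %s (%d runs)\n", name, reps);
    clock_t begin = clock();
    for( int i = 0; i < reps; i++ ) {
        int *ret = functionPtr(input);
        free(ret);  /* freed inside the loop, so allocation cost is included */
    }
    clock_t end = clock();
    /* Average per-call time in milliseconds. */
    double time_spent = (double)(end - begin) * 1000.0 / CLOCKS_PER_SEC / reps;
    printf("Algorithm took %f milliseconds per call \n", time_spent);
}

Called as `do_benchmark_reps("prefixAverages2", &prefixAverages2, input, 1000)`, even a microsecond-scale function accumulates enough total time to register.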

Note that clock can fail.

If the processor time used is not available or its value cannot be represented, the function returns the value (clock_t)(-1).

You'll want to check for that failure.

if( begin == (clock_t)-1 ) {
    fprintf(stderr, "Begin time not available.\n");
}
else if( end == (clock_t)-1 ) {
    fprintf(stderr, "End time not available.\n");
}
else {
    double time_spent = (double)(end - begin) * 1000.0 / CLOCKS_PER_SEC;
    printf("Algorithm took %f milliseconds to execute \n", time_spent);
}
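
Given the OP's edit (the second algorithm finishes before the timer advances), it may also be worth measuring the effective granularity of `clock()` on your system. A rough sketch, assuming only <stdio.h> and <time.h>; it simply spins until the reported value changes:

#include <stdio.h>
#include <time.h>

int main(void) {
    /* Spin until clock() reports a new value; the difference is the
       smallest interval this clock can actually resolve. */
    clock_t start = clock();
    clock_t next = start;
    while( next == start )
        next = clock();
    printf("clock() granularity: ~%f ms (CLOCKS_PER_SEC = %ld)\n",
           (double)(next - start) * 1000.0 / CLOCKS_PER_SEC,
           (long)CLOCKS_PER_SEC);
    return 0;
}

If this prints something like 1 or 10 ms, anything faster than that will report as 0.000 with the `clock()`-based timing above.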
– Schwern

I'm more familiar with doing this on Windows using QueryPerformanceCounter, so I'm probably doing all sorts of bad stuff here, but here's the basic idea for measuring short loops:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main()
{
  printf("Beginning prefixAverages2\n");
  struct timespec begin, end;
  clock_gettime(CLOCK_REALTIME, &begin);
  int *a1 = prefixAverages2(50000);
  clock_gettime(CLOCK_REALTIME, &end);
  /* Convert both parts of the timespec difference to microseconds. */
  double time_spent = (end.tv_nsec - begin.tv_nsec) / 1000.0;
  time_spent += (end.tv_sec - begin.tv_sec) * 1000000.0;

  free(a1);
  printf("Time spent %f microseconds\n", time_spent);
}

Output:

Beginning prefixAverages2
Time spent 427.000000 microseconds

PS: It turns out `clock()` doesn't measure wall time: Calculate execution time when sleep() is used
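
A caveat on the clock choice: CLOCK_REALTIME can jump if the system clock is adjusted (for example by NTP), so for interval timing CLOCK_MONOTONIC is usually the safer choice. The sketch below just swaps the clock ID in the calls above:

  /* Same measurement, but with a clock that never jumps backwards. */
  struct timespec begin, end;
  clock_gettime(CLOCK_MONOTONIC, &begin);
  /* ... code under test ... */
  clock_gettime(CLOCK_MONOTONIC, &end);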

– zzxyz