I am trying to measure FLOPS in a rather simple way:
clock_gettime(CLOCK_REALTIME, &start);
num1 + num2;
clock_gettime(CLOCK_REALTIME, &end);
ns += end.tv_nsec - start.tv_nsec;
I run this in a loop and then compute how many nanoseconds, on average, it takes to do this operation.
I am obtaining results that I was not expecting based on the published performance numbers of my CPU.
After further reading, my guess is that I am erroneously equating a single C statement that adds two floating-point numbers with a single floating-point operation (FLOP).
My question is: How exactly is a FLOP measured? Is it based purely on properties of the CPU, such as its frequency?
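For reference, my current understanding of the "CPU properties" side is something like the calculation below. The core count, clock, SIMD width, and FMA figures are made-up placeholder values, not the actual numbers for my CPU:

#include <stdio.h>

int main(void)
{
    /* Hypothetical figures -- substitute the real values for a given CPU. */
    double cores          = 4.0;    /* physical cores                          */
    double ghz            = 3.0;    /* sustained clock in GHz                  */
    double simd_lanes     = 8.0;    /* e.g. 8 single-precision lanes per vector */
    double flops_per_lane = 2.0;    /* 2 if the CPU can issue a fused multiply-add */

    /* Simplified theoretical peak: cores * clock * lanes * FLOPs per lane per cycle.
       Real CPUs may have multiple vector units per core, which this ignores. */
    double peak_gflops = cores * ghz * simd_lanes * flops_per_lane;
    printf("Theoretical peak: %.1f GFLOPS\n", peak_gflops);
    return 0;
}

Is that roughly the right way to think about it, or is a FLOP rating something that has to be measured empirically?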
My complete code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    if (argc != 2) return -1;
    int n = atoi(argv[1]);
    if (n <= 0) return -1;

    float num1;
    float num2;
    struct timespec start, end, res;
    float ns = 0;

    clock_getres(CLOCK_REALTIME, &res);
    fprintf(stderr, "CLOCK resolution: %ld nanosecond(s).\n", res.tv_nsec);

    for (int i = 0; i < n; i++) {
        num1 = ((float)rand()/(float)(RAND_MAX));
        num2 = ((float)rand()/(float)(RAND_MAX));
        clock_gettime(CLOCK_REALTIME, &start);
        num1 + num2;
        clock_gettime(CLOCK_REALTIME, &end);
        ns += end.tv_nsec - start.tv_nsec;
    }

    fprintf(stderr, "Average time per operation: %.4f ns\n", ns/n);
}
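In case it makes the question clearer, this is roughly the batch-timing variant I was considering instead: time a large number of additions in one measured region, keep the running sum live, and print it so the work cannot be optimized away. The iteration count of 100 million is an arbitrary choice:

#include <stdio.h>
#include <time.h>

int main(void)
{
    const long iters = 100000000L;           /* arbitrary batch size          */
    float acc = 0.0f, x = 1.000001f;
    struct timespec start, end;

    clock_gettime(CLOCK_REALTIME, &start);
    for (long i = 0; i < iters; i++)
        acc += x;                            /* one floating-point addition   */
    clock_gettime(CLOCK_REALTIME, &end);

    /* Use the full (tv_sec, tv_nsec) difference so second boundaries are handled. */
    double secs = (end.tv_sec - start.tv_sec)
                + (end.tv_nsec - start.tv_nsec) / 1e9;

    /* Printing acc keeps the compiler from discarding the loop. */
    printf("acc = %f, %.2f million adds per second\n", acc, iters / secs / 1e6);
    return 0;
}

Would something like this be a more meaningful measurement, or is it still wrong to call each of those additions a FLOP?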