I've done some SO searching and found this and that outlining timing methods.
My problem is that I need to determine the CPU time (in milliseconds) required to execute the following loop:
for (int i = 0, temp = 0; i < 10000; i++)
{
    if (i % 2 == 0)
    {
        temp = (i / 2) + 1;
    }
    else
    {
        temp = 2 * i;
    }
}
I've looked at two methods: clock() and steady_clock::now(). Per the docs, I know that clock() returns "ticks", so I can convert the difference to seconds by dividing it by CLOCKS_PER_SEC. The docs also mention that steady_clock is designed for interval timing, but you have to call duration_cast<milliseconds> to change its unit.
What I've done to time the two is run each by itself in a separate run (timing both in the same run might make one appear slower simply because the other ran first):
clock_t t = clock();
for (int i = 0, temp = 0; i < 10000; i++)
{
    if (i % 2 == 0)
    {
        temp = (i / 2) + 1;
    }
    else
    {
        temp = 2 * i;
    }
}
t = clock() - t;
cout << (float(t) / CLOCKS_PER_SEC) * 1000 << "ms taken" << endl;
chrono::steady_clock::time_point p1 = chrono::steady_clock::now();
for (int i = 0, temp = 0; i < 10000; i++)
{
    if (i % 2 == 0)
    {
        temp = (i / 2) + 1;
    }
    else
    {
        temp = 2 * i;
    }
}
chrono::steady_clock::time_point p2 = chrono::steady_clock::now();
cout << chrono::duration_cast<chrono::milliseconds>(p2 - p1).count() << "ms taken" << endl;
Output:
0ms taken
0ms taken
Do both these methods floor the result? Surely some fraction of a millisecond elapsed?
So which is ideal (or rather, more appropriate) for determining the CPU time required to execute the loop? At first glance, I would argue for clock(), since the docs specifically say it's for determining CPU time.
For context, my CLOCKS_PER_SEC holds a value of 1000.
Edit/Update:
Tried the following:
clock_t t = clock();
for (int j = 0; j < 1000000; j++) {
    volatile int temp = 0;
    for (int i = 0; i < 10000; i++)
    {
        if (i % 2 == 0)
        {
            temp = (i / 2) + 1;
        }
        else
        {
            temp = 2 * i;
        }
    }
}
t = clock() - t;
cout << (float(t) * 1000.0f / CLOCKS_PER_SEC / 1000000.0f) << "ms taken" << endl;
Outputs: 0.019953ms taken
clock_t start = clock();
volatile int temp = 0;
for (int i = 0; i < 10000; i++)
{
    if (i % 2 == 0)
    {
        temp = (i / 2) + 1;
    }
    else
    {
        temp = 2 * i;
    }
}
clock_t end = clock();
cout << fixed << setprecision(2) << 1000.0 * (end - start) / CLOCKS_PER_SEC << "ms taken" << endl;
Outputs: 0.00ms taken
chrono::high_resolution_clock::time_point p1 = chrono::high_resolution_clock::now();
volatile int temp = 0;
for (int i = 0; i < 10000; i++)
{
    if (i % 2 == 0)
    {
        temp = (i / 2) + 1;
    }
    else
    {
        temp = 2 * i;
    }
}
chrono::high_resolution_clock::time_point p2 = chrono::high_resolution_clock::now();
cout << (chrono::duration_cast<chrono::microseconds>(p2 - p1).count()) / 1000.0 << "ms taken" << endl;
Outputs: 0.072ms taken
chrono::steady_clock::time_point p1 = chrono::steady_clock::now();
volatile int temp = 0;
for (int i = 0; i < 10000; i++)
{
    if (i % 2 == 0)
    {
        temp = (i / 2) + 1;
    }
    else
    {
        temp = 2 * i;
    }
}
chrono::steady_clock::time_point p2 = chrono::steady_clock::now();
cout << (chrono::duration_cast<chrono::microseconds>(p2 - p1).count()) / 1000.0f << "ms taken" << endl;
Outputs: 0.044ms taken
So the question becomes: which is valid? The second method seems invalid to me, because I think the loop completes in well under a millisecond, which is below clock()'s resolution here. I understand why the first method works (it simply executes long enough to measure), but the last two methods produce drastically different results.
One thing I've noticed is that after compiling the program, the first run may give 0.073ms (for the high_resolution_clock) and 0.044ms (for the steady_clock), but all subsequent runs fall within the range of 0.019 - 0.025ms.