Force GCC not to rearrange some part of the code

Question

I have a program that is very time critical and I want to measure the time it takes to execute the program. Here is a simple schematic of the code.

clock1=current_time();
instruction 1;
instruction 2;
.
.
.
.
instruction n;
clock2=current_time();
total time = clock2-clock1;

When I compile the above code with GCC using -O2 or -O3, it always moves clock2=current_time() into the middle of the instructions, which gives wrong results. Is there any way to stop GCC from rearranging this part of the code while still optimizing all other parts? One requirement: the routine that measures time is unchangeable; I have to use the routine provided. Thank you for your help. Regards.

Update: with breakpoints

clock1=0;clock2=0;
clock1=current_time();
breakpoint 1 : instruction 1; // clock1=<value> clock2=0
instruction 2;
.
.
.
.
breakpoint 2 : instruction n;// clock1=<value> clock2=<value>
clock2=current_time();
total time = clock2-clock1;
breakpoint 3:// clock1=<value> clock2=<value> total time=clock2-clock1;
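
For reference, a minimal sketch of the compiler-barrier approach that the comments below suggest. It assumes GCC or Clang inline-asm syntax. `current_time()` here is only a hypothetical stand-in built on the standard `clock()`, since the real routine is not shown, and `work()` and `data` are placeholders for "instruction 1 ... instruction n".

```c
#include <time.h>

/* GCC/Clang-specific compiler barrier: emits no machine instructions, but
   the "memory" clobber tells the compiler that memory may be read or
   written here, so loads and stores may not be moved across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the question's current_time(). */
static clock_t current_time(void)
{
    return clock();
}

/* Placeholder data touched by the timed region (external linkage, so the
   stores are observable and cannot simply be discarded). */
unsigned data[1024];

/* Placeholder for "instruction 1 ... instruction n". */
static void work(void)
{
    for (unsigned i = 0; i < 1024; i++)
        data[i] = data[i] * 3u + i;
}

/* Returns the elapsed clock ticks for one run of work(). */
long measure(void)
{
    clock_t clock1 = current_time();
    COMPILER_BARRIER();   /* work() may not start before this point  */
    work();
    COMPILER_BARRIER();   /* work()'s stores must be done by here    */
    clock_t clock2 = current_time();
    return (long)(clock2 - clock1);
}
```

This only restricts the compiler; it does not stop the CPU from reordering at run time. C11's `atomic_signal_fence(memory_order_seq_cst)` from `<stdatomic.h>` is the standard-library spelling of a compiler-only fence, and as far as I know GCC implements it as a barrier of this kind.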

Read [this](http://preshing.com/20120625/memory-ordering-at-compile-time/). — cadaniluk, Dec 13 '16 at 13:22
Use compiler and memory barriers. But note that this way of profiling is quite imprecise (depending on target architecture). — too honest for this site, Dec 13 '16 at 13:25
@Downvoter: Better use standard ways provided by `stdatomic.h`. — too honest for this site, Dec 13 '16 at 13:27
@Downvoter I just tried the method. It does not work for -O2. — Rick, Dec 13 '16 at 13:31
A poor man's memory barrier is to declare the variables as `volatile`. Did you try this? Ultimately it sounds like a bug inside `current_time`, it should be using volatile variables. — Lundin, Dec 13 '16 at 13:34
@user2764478: Did you read the standard or otherwise search? What don't you understand that was not already asked? Note: atomics and barriers are not something we can explain within the limits of the site rules, much less something you can comprehend in a few minutes if you are not familiar with the concepts. — too honest for this site, Dec 13 '16 at 14:13
@Lundin: You should know better than to even mention `volatile` here. Additionally, `volatile` does not prevent non-volatile accesses from being reordered, and making all objects `volatile` will most likely influence the result. — too honest for this site, Dec 13 '16 at 14:16
@Olaf Hence "poor man's memory barrier". You don't want the actual benchmarking to get optimized away. If the benchmarking function is broken and doesn't use some means of optimizer protection internally, then there's no other way to save it. In that case, then `a=clock(); b=clock() - a;` could get optimized to `a=clock(); b=0;`. — Lundin, Dec 13 '16 at 15:30
@Lundin: I'd call compiler barriers a "poor man's barrier", as they ignore hardware reordering, but they at least keep the compiler from cross-border reordering. `volatile` does not even do that. Maybe I just misunderstand the phrase "poor man's". If that is actually a euphemism for "idio?'s barrier" or "does not work at all", I agree about that. Or maybe you are talking about `current_time` only; there `volatile` is of course a necessary requirement. But it will not suffice. — too honest for this site, Dec 13 '16 at 15:39
Hi all, thank you for your answers. I used compiler barriers as suggested. Even with compiler barriers, my running time is lower than expected, so I used gdb to follow the execution flow. I have added another code listing to the question. I put a breakpoint at instruction 1, another at instruction n, and a third after the total time update. During debugging I saw clock1=<value>, clock2=0 after breakpoint 1; clock1=<value>, clock2=0 after breakpoint 2; and clock1=<value>, clock2=<value> and time=clock2-clock1 after the third breakpoint. Does that mean current_time is executing properly? — Rick, Dec 14 '16 at 10:32
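
One possible reason a barrier alone might not be enough: a "memory" clobber only constrains memory accesses. Work that the compiler keeps entirely in registers, or whose result is never used, can still be moved across the barrier or removed. A hedged sketch of one way to tie such work to the timed region follows, again assuming GCC or Clang inline asm. The standard `clock()` stands in for the unchangeable timing routine, and `sum_squares()` is a hypothetical workload.

```c
#include <time.h>

/* Hypothetical workload that can live entirely in registers. */
static unsigned long sum_squares(unsigned long n)
{
    unsigned long s = 0;
    for (unsigned long i = 1; i <= n; i++)
        s += i * i;
    return s;
}

/* Times sum_squares(n); stores the elapsed clock ticks in *elapsed. */
unsigned long timed_sum(unsigned long n, clock_t *elapsed)
{
    clock_t clock1 = clock();

    /* "+r"(n): the compiler must assume the asm modified n, so the
       computation that depends on n cannot start (or be folded to a
       constant) before this point. */
    __asm__ __volatile__("" : "+r"(n) : : "memory");

    unsigned long result = sum_squares(n);

    /* "r"(result): the asm "uses" result, so the value must be fully
       computed before this point and cannot be discarded as dead code. */
    __asm__ __volatile__("" : : "r"(result) : "memory");

    clock_t clock2 = clock();
    *elapsed = clock2 - clock1;
    return result;
}
```

The empty asm statements generate no instructions; they only create dependencies the optimizer has to respect. Note also that in an optimized build the line and variable information gdb shows may not follow source order, so breakpoint observations at -O2 are weak evidence about the actual instruction order.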
