I have a native C++ application which performs heavy calculations and consumes a lot of memory. My goal is to optimize it, mainly reduce its run time.
After several cycles of profiling-optimizing, I tried the Profile Guided Optimization which I never tried before.
I followed the steps described on MSDN Profile-Guided Optimizations, changing the compilation (/GL
) and linking (/LTCG
) flags. After adding /GENPROFILE
, I ran the application to create .pgc
and .pdg
files, then changed the linker options to /USEPROFILE
and watched additional linker messages that reported that the profiling data was used:
3> 0 of 0 ( 0.0%) original invalid call sites were matched.
3> 0 new call sites were added.
3> 116 of 27096 ( 0.43%) profiled functions will be compiled for speed, and the rest of the functions will be compiled for size
3> 63583 of 345025 inline instances were from dead/cold paths
3> 27096 of 27096 functions (100.0%) were optimized using profile data
3> 608324578581 of 608324578581 instructions (100.0%) were optimized using profile data
3> Finished generating code
Everything looked promising, until I measured the program's performance.
The results were absolutely counterintuitive for me
Performance went down instead of up! 4% to 5% slower than without using Profile Guided Optimization (when comparing with/without the
/USEPROFILE
option).Even when running the exact same scenario that was used with
/GENPROFILE
to create the Profile Guided Optimization data files, it ran 4% slower.
What is going on?