
I'm writing some code in which speed is very important. I'm just moving on to building the main binary after writing the test cases. For my test runner, I simply hand everything to the linker with wildcards (as below).

In my mind, linking is the stage where C++ glues things together - it fills in references to functions etc. and puts it all together for the binary.

# Do the linking for the test binary
$(BIN)test_cases: $(TEST)TestRunner.o
    $(CC) $(TEST)*.o $(SRC)*.o $(CPPUNITLINKS) $(MAINLINKS) -o $(BIN)test_cases

My question is, given that I am looking to speed up my program in any possible way, would I be better off linking only the bare minimum of object files required for the 'main' binary? Will this result in a leaner executable or a faster program, or does the linker already discard anything it doesn't need?

joeButler
  • I don't think linking all the object files will impact the performance of the exe – Matt Oct 16 '13 at 20:05
  • Should be simple to try an experiment. Write "hello, world", link with a bunch of unneeded .objs and see how big the executable is. Repeat without the .objs and see how big that executable is. – user888379 Oct 16 '13 at 20:08
  • @user888379: Since (in modern OSes) the whole executable isn't loaded into memory - only the parts that are actually used - there is no loss of speed from having a large executable. – Mats Petersson Oct 16 '13 at 20:11
  • Sure, I can test this; I may link explicitly for the main binary and see how this affects it. My program is difficult to speed-test, however, as there are many random aspects to it. I guess I'm after learning a bit more about the linking process, to feel like I'm doing things right! – joeButler Oct 16 '13 at 20:11

3 Answers


When you link object files into your program, the linker will resolve any unresolved symbols. If you want to eliminate dead code (which GCC does not do by default), you could do the following:

  1. Build object files with the -fdata-sections and -ffunction-sections flags (refer to the GCC manual for more information);
  2. Link object files with the -Wl,--gc-sections flag, which tells the linker to discard unreferenced sections (a sketch follows this list).
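
As a minimal sketch, reusing the Makefile variables from the question (so $(BIN), $(TEST), $(SRC), and compile rules that actually use $(CXXFLAGS) are assumptions here), the two steps wire together roughly like this:

    # 1. Compile so every function and data item gets its own section
    CXXFLAGS += -fdata-sections -ffunction-sections

    # 2. At link time, let the linker garbage-collect unreferenced sections
    $(BIN)test_cases: $(TEST)TestRunner.o
        $(CC) $(TEST)*.o $(SRC)*.o $(CPPUNITLINKS) $(MAINLINKS) -Wl,--gc-sections -o $(BIN)test_cases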

NOTE: by default, only unused static functions are stripped out automatically.

Theoretically, the presence of redundant symbols only affects the size of the resulting program. However, I've stumbled across posts where people reported 1% to 2% performance improvements after stripping dead code. Of course, the code base has to be of substantial size for such an effect to be noticeable.

Beware that this approach sometimes may not work properly. For instance, I've experienced crashes and linkage problems on some systems, probably due to bugs in the implementation of this feature.

Furthermore, don't assume that these flags improve performance and/or size in every case. There are good reasons why this feature is enabled via flags rather than being the default. In fact, sometimes the linker may create larger object and executable files and/or slower code, not to mention that you will definitely run into problems with debugging too.

To conclude, be very cautious when using this feature, and always profile your code before and after as recommended in other answers.

Finally, if you are really after speed, you can check my other answer on some useful GCC optimization flags.

Last but not least, so-called Link Time Optimization (LTO) is a relatively new and promising feature of GCC that has recently become more or less stable to use. The respective flag is -flto; see here and here for more information. Although it is usable nowadays, not everything is shiny on some platforms yet. For instance, on Windows the GCC ports MinGW/MinGW-w64 are still struggling to make LTO support production quality.
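
To give an idea, and again borrowing the question's Makefile variables as assumptions, enabling LTO is roughly a matter of passing -flto (plus an optimization level) at both the compile and the link stage:

    # -flto must be present both when compiling and when linking
    CXXFLAGS += -O2 -flto

    $(BIN)test_cases: $(TEST)TestRunner.o
        $(CC) -O2 -flto $(TEST)*.o $(SRC)*.o $(CPPUNITLINKS) $(MAINLINKS) -o $(BIN)test_cases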

Alexander Shukaev

The number of object files used to make the binary plays at most a small role in execution time, as long as the resulting file isn't absolutely huge. If, on the other hand, it's the time it takes to BUILD your program that you are trying to improve, then trimming the number of object files may be one step towards a quicker build.

The time it takes to execute a program depends very much on the code that is actually executed. Whether you have 0, 1, 3, 5, 20, 100, or 10000 functions that are never called will not make a measurable difference.

The key to understanding why your code runs slowly (if indeed it is running slowly - it may simply take that long to perform the work you have asked for) is to use a tool called a profiler. There are plenty of profilers to choose from, and they all do largely the same thing. At the most basic level, a profiler will tell you how much time is spent in each function, which in turn tells you where to focus your effort. An instruction-level profiler will then let you drill down to individual instructions to see what the compiler has done and where the time is spent WITHIN a function.
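
For example, with the GNU toolchain, gprof is one basic option. A rough sketch of a session (the paths here are invented to match the question's layout):

    # Build with profiling instrumentation (-pg), run, then read the report
    g++ -pg -O2 -o bin/test_cases src/*.cpp
    ./bin/test_cases                        # writes gmon.out on exit
    gprof bin/test_cases gmon.out > profile.txt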

Mats Petersson

The first step in speeding up any program is to PROFILE. A good profile will show the amount of time spent in every function of your program.

With your data from profiling, find the functions that are called the most or where the most time is spent. These are the functions that you should concentrate on optimizing.

When optimizing, optimize by requirements first (e.g. remove some of them), then by design (choose different algorithms, remove functions), then by coding (rewrite code to be more efficient, such as by reducing branches and jumps), and finally by using platform-specific code (specialized assembly instructions). A shortcut to the coding optimizations is to tell the compiler to use its maximum optimization for speed, as shown below.
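
With GCC, for instance, that shortcut is just the optimization-level flag; whether the more aggressive level actually helps your program is something to verify with your profiler:

    # -O2 is a solid default; -O3 optimizes more aggressively for speed
    CXXFLAGS += -O3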

If you are going to use dynamic or shared libraries, place frequently used functions in the same library. That way the OS only has to load a small number of libraries as they are needed.
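
As a hypothetical GCC sketch (the file names are invented for illustration), grouping hot-path code into one shared library looks like this:

    # Compile position-independent code, then bundle the hot objects together
    g++ -fPIC -c hot_path_a.cpp hot_path_b.cpp
    g++ -shared hot_path_a.o hot_path_b.o -o libhot.so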

Thomas Matthews