I need to run a very computation-intensive program, so performance is my first concern. There seem to be a lot of CPU details that should be taken into account, like hardware prefetching, caches, branch prediction, pipelining, and so on. How can I get thorough information on this topic?

Thanks.

iqapple
  • http://www.azillionmonkeys.com/qed/optimize.html – karthikr May 29 '13 at 02:05
  • This is an extremely broad question. First write your program to be *correct*, then *profile* it to measure its performance and optimize it accordingly. – Adam Rosenfield May 29 '13 at 02:06
  • The same way people get thorough information on any topic these days ... do a web search for articles and books on the subject. – Jim Balter May 29 '13 at 02:11
  • @AdamRosenfield Yes, my program runs correctly. But I don't know whether it causes cache misses, branch mispredictions, or things like that. – iqapple May 29 '13 at 02:11
  • First you need to turn your elephant into a race horse by [*this method*](http://stackoverflow.com/a/378024/23771). Only then, worry about that hardware-level stuff. – Mike Dunlavey May 29 '13 at 02:14

4 Answers

The first and most important thing to learn is this: DON'T TRY TO GUESS why your program might be slow. Get it working, then TEST IT to find out. Sure, there are certain things likely to be a problem, but real code on real data sets will often surprise you. You really can't know ahead of time where the bottlenecks will be, so learn to use a profiling tool like Valgrind to measure your actual code and go from there.

Lee Daniel Crocker

I understand your concern about writing an application that has to take the details of a typical computer architecture into account. Before doing that, I would advise you to go through a book on computer architecture, so that you get a complete picture of the whats and hows before you start tuning the application.

Let me suggest one such book: http://www.amazon.com/Computer-Organization-Architecture-Stallings-Communications/dp/013293633X

Hope I helped.

You should know computer programming and you should know it well. Write clean code, profile it, find the bottlenecks and update the code. Most likely the compiler already gives you plenty of optimization options, so read its documentation. And I believe compilers nowadays do a much better job than most programmers.

When you get all this and you are still missing performance, you can read the following:

MartinTeeVarga

A good optimizing compiler will already know all of the details of "hardware prefetching, caches, branch prediction, pipelining", etc. You will need to tell the compiler which specific CPU you are targeting. For gcc, use the -march and -mtune options as a starting point.

Experiment with different compilers like clang and the Intel C compiler.

Profile your program with various input data, and identify where the bottlenecks are, then look into how to write faster code for the bottlenecks. There's almost always more to be gained by using a smarter algorithm than there is in tweaking the assembly code for a particular bottleneck.

markgz