Testing an Algorithms speed. How?

Question

I'm currently testing different algorithms, which determine whether an Integer is a real square or not. During my research I found this question at SOF: Fastest way to determine if an integer's square root is an integer

I'm compareably new to the Programming scene. When testing the different Algorithms that are presented in the question, I found out that this one

bool istQuadratSimple(int64 x)
{
    int32 tst = (int32)sqrt(x);
    return tst*tst == x;
}

actually works faster than the one provided by A. Rex in the Question I posted. I've used an NS-Timer object for this testing, printing my results with an NSLog.

My question now is: How is speed-testing done in a professional way? How can I achieve equivalent results to the ones provided in the question I posted above?

Create a loop that calls the function several thousand times. — Devolus, Sep 17 '13 at 16:39
[So You Want To Write Your Own Benchmark presentation](http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation) (Java, but close enough). [Top benchmarking question on StackOverflow](http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) (also Java, but contains useful generic tips). — Bernhard Barker, Sep 17 '13 at 16:46
Call the function enough times that it takes at least 1 real-time second to elapse. This will also allow other processes to execute, simulating a real environment. Otherwise, you might not see the effects of cache misses. — Mark Lakata, Sep 17 '13 at 16:49
Given the fact that machines today are able to do millions of computations per second for visualizing a considerable difference between the speed of different algorithms you should run the required piece of code hundreds or probably thousands of time and then take a average of total run time. The only you can see a considerable difference in speed. Hope it helps. — Prateek, Sep 17 '13 at 16:54
related: [Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987) re: common pitfalls. — Peter Cordes, Oct 18 '22 at 03:42

score 2 · Accepted Answer · edited May 23 '17 at 11:56

The problem with calling just this function in a loop is that everything will be in the cache (both the data and the instructions). You wouldn't measure anything sensible; I wouldn't do that.

Given how small this function is, I would try to look at the generated assembly code of this function and the other one and I would try to reason based on the assembly code (number of instructions and the cost of the individual instructions, for example).

Unfortunately, it only works in trivial / near trivial cases. For example, if the assembly codes are identical then you know there is no difference, you don't need to measure anything. Or if one code is like the other plus additional instructions; in that case you know that the longer one takes longer to execute. And then there are the not so clear cases... :(
(See the update below.)

You can get the assembly with the -S -emit-llvm flags from clang and with the -S flag from gcc.

Hope this help.

UPDATE: Response to Prateek's question in the comment "is there any way to determine the speed of one particular algorithm?"

Yes, it is possible but it gets horribly complicated REALLY quick. Long story short, ignoring the complexity of modern processors and simply accumulating some predefined cost associated with the instructions can lead to very very inaccurate results (the estimate off by a factor of 100, due to the cache and the pipeline, among others). If you try take into consideration the complexity of the modern processors, the hierarchical cache, the pipeline, etc. things get very difficult. See for example Worst Case Execution Time Prediction.

Unless you are in a clear situation (trivial / near trivial case), for example the generated assembly codes are identical or one is like the other plus a few instructions, it is also hard to compare algorithms based on their generated assembly.

However, here a simple function of two lines is shown, and for that, looking at the assembly could help. Hence my answer.

Yeah I agree that going through the assemble would be the best we can do to get the most optimal result for comparing two algorithm's speed. But I am curious about how that helps in determining the speed of an algorithm. It might be able to compare two given algorithms but is there any way to determine the speed of one particular algorithm? — Prateek, Sep 17 '13 at 17:50
@Prateek If you find the answer (I have slightly updated again, inserted a new link), feel free to upvote it! ;) — Ali, Sep 17 '13 at 21:18
So I checked the assembly code of the Algorithm by A. Rex in the question I posted. It is by far larger than the one I posted. I'm curious: Somehow the people from the question I posted must have found a way to check the algorithms against each other. What I want for the moment is an actual prove for the results they provided. If I have that, I wonder, how they found out that this particular algorithm xy is xy percent faster than the other one. — Georg Zänker, Sep 20 '13 at 16:09
@GeorgZänker Micro-benchmarking can be notoriously difficult to do reliably. To add injury to insult, lots of poorly executed benchmarks / experiments are available. So, just because someone says algorithm X is Y% faster, it is not necessarily true. — Ali, Sep 20 '13 at 18:09

score 1 · Answer 2 · edited Oct 17 '22 at 18:35

I am not sure if there is any professional way of checking the speed (if there is let me know as well). For the method that you directed to in your question I would probably do something this this in java.

package Programs;

import java.math.BigDecimal;
import java.math.RoundingMode;

public class SquareRootInteger {

    public static boolean isPerfectSquare(long n) {
        if (n < 0)
            return false;

        long tst = (long) (Math.sqrt(n) + 0.5);
        return tst * tst == n;
    }

    public static void main(String[] args) {
        long iterator = 1;
        int precision = 10;
        long startTime = System.nanoTime(); //Getting systems time before calling the isPerfectSquare method repeatedly 
        while (iterator < 1000000000) {
            isPerfectSquare(iterator);
            iterator++;
        }
        long endTime = System.nanoTime(); // Getting system time after the 1000000000 executions of isPerfectSquare method
        long duration = endTime - startTime;
        BigDecimal dur = new BigDecimal(duration);
        BigDecimal iter = new BigDecimal(iterator);
        System.out.println("Speed "
                + dur.divide(iter, precision, RoundingMode.HALF_UP).toString()
                + " nano secs"); // Getting average time taken for 1 execution of method.

    }
}

You can check your method in similar fashion and check which one outperforms other.

score 0 · Answer 3 · answered Sep 17 '13 at 17:22

0

Record the time value before your massive calculation and the value after that. The difference is the time executed.
Write a shell script where you will run the program. And run 'time ./xxx.sh' to get it's running time.

answered Sep 17 '13 at 17:22

TwoCode

127
7

Testing an Algorithms speed. How?

3 Answers3