
I have been developing with Java for some time now, and I always strive to do things in the most efficient way. Up to now that has mostly meant trying to condense the number of lines of code I write. But now that I have started working with 2D rendering, it is more about how long a certain piece of code takes to execute, since it is called many times a second.

My question:

Is there some way to measure how long it takes to execute a certain piece of code in Eclipse, Java, ... ?

Jobs
    The correlation between speed and lines of code is at best spurious. You will need to learn some computer science in order to carry out any useful analysis of algorithms. Otherwise you will have to go down the empirical route. Either way, the subject area is far too broad. – Boris the Spider Jan 03 '15 at 01:06

3 Answers


First, some nitpicking. You title this question ...

Putting a number to the efficiency of an algorithm

  • There is no practical quantifiable measure of "efficiency" for an algorithm. Efficiency (as normally conceived) is a measure of "something" relative to an ideal / perfect; e.g. a hypothetical 100% efficient steam engine would convert all of the energy in the coal being burned into useful "work". But for software, there is no ideal to measure against. (Or if there is, we can't be sure that it is the ideal.) Hence "efficiency" is the wrong term.

    What you actually mean is a measure of the "performance" of ...

  • An algorithm is an abstract concept, and its performance cannot be measured.

    What you actually want is a measure of the performance of a specific implementation of an algorithm; i.e. some actual code.


So how do you quantify performance?

Well, ultimately there is only one sound way to quantify performance. You measure it, empirically. (How you do that ... and the limitations ... are a matter I will come to.)

But what about theoretical approaches?

A common theoretical approach is to analyse the algorithm to give you a measure of computational complexity. The classic measure is Big-O Complexity. This is a very useful measure, but unfortunately Big-O Complexity does not actually measure performance at all. Rather, it is a way of characterizing the behaviour of an algorithm as the problem size scales up.

To illustrate, consider these two algorithms for adding the N numbers in an array:

    int sum(int[] input) {
        int sum = 0;
        for (int i = 0; i < input.length; i++) {   // one pass over the N elements: O(N)
            sum += input[i];
        }
        return sum;
    }

    int sum(int[] input) {
        int tmp = p(1000);  // calculates the 1000th prime number: a large, pointless extra calculation
        int sum = 0;
        for (int i = 0; i < input.length; i++) {   // still one pass over the N elements: O(N)
            sum += input[i];
        }
        return sum;
    }

We can prove that both versions of sum have a complexity of O(N), according to the accepted mathematical definitions. Yet it is obvious that the first one will be faster than the second one ... because the second one does a large (and pointless) calculation as well.

In short: Big-O Complexity is NOT a measure of Performance.

What about theoretical measures of Performance?

Well, as far as I'm aware, there are none that really work. The problem is that real performance (as in time taken to complete) depends on various complicated things in the compilation of code to executables AND the way that real execution platforms (hardware) behave. It is too complicated to do a theoretical analysis that will reliably predict actual performance.


So how do you measure performance?

The naive answer is to benchmark like this (a bare-bones Java sketch follows the list):

  • Take a clock measurement
  • Run the code
  • Take a second clock measurement
  • Subtract the first measurement from the second ... and that is your answer.
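
In Java, that naive recipe boils down to something like the sketch below. (NaiveBenchmark and the input size are made up for this illustration; the timing calls are plain System.nanoTime(), and the sum method is the one from above.)

    public class NaiveBenchmark {

        static int sum(int[] input) {
            int sum = 0;
            for (int i = 0; i < input.length; i++) {
                sum += input[i];
            }
            return sum;
        }

        public static void main(String[] args) {
            int[] input = new int[1_000_000];             // some representative input

            long start = System.nanoTime();               // first clock measurement
            int result = sum(input);                      // run the code
            long end = System.nanoTime();                 // second clock measurement

            System.out.println("Elapsed: " + (end - start) + " ns (result = " + result + ")");
        }
    }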

But it doesn't work. Or more precisely, the answer you get may be wildly different from the performance that the code exhibits when you use it in a real world context.

Why?

  • There may be other things happening on the machine ... or things that have happened earlier ... that influence the code's execution time. Another program might be running. You may have files pre-loaded into the file system cache. You may get hit by CPU clock scaling ... or a burst of network traffic.

  • Compilers and compiler flags can often make a lot of difference to how fast a piece of code runs.

  • The choice of inputs can often make a big difference.

  • If the compiler is smart, it might deduce that some or all of your benchmarked code does nothing "useful" (in the context) ... and optimize it away entirely.

And for languages like Java and C#, there are other important issues:

  • Implementations of these languages typically do a lot of work during startup to load and link the code.

  • Implementations of these languages are typically JIT compiled. This means that the language runtime system does the final translation of the code (e.g. bytecodes) to native code at runtime. The performance characteristics of your code after JIT compilation change drastically, and the time taken to do the compilation may be significant ... and may distort your time measurements.

  • Implementations of these languages typically rely on a garbage collected heap for memory management. The performance of a heap is often uneven, especially at startup.

These things (and possibly others) contribute to something that we call (in Java) JVM warmup overheads; particularly JIT compilation. If you don't take account of these overheads in your methodology, then your results are liable to be distorted.

So what is the RIGHT way to measure performance of Java code?

It is complicated, but the general principle is to run the benchmark code lots of times in the same JVM instance, measuring each iteration. The first few measurements (during JVM warmup) should be discarded, and the remaining measurements should be averaged.
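
Done by hand, that principle looks roughly like the sketch below. (The iteration counts are arbitrary assumptions; a real harness would also handle statistics, forking, and dead-code elimination far more carefully.)

    public class WarmupBenchmark {

        static int sum(int[] input) {
            int sum = 0;
            for (int i = 0; i < input.length; i++) {
                sum += input[i];
            }
            return sum;
        }

        public static void main(String[] args) {
            int[] input = new int[1_000_000];
            int warmupIterations = 20;        // discarded: covers JIT compilation, class loading, etc.
            int measuredIterations = 50;      // kept and averaged

            int sink = 0;                     // consume results so the JIT cannot optimize the calls away
            for (int i = 0; i < warmupIterations; i++) {
                sink += sum(input);
            }

            long total = 0;
            for (int i = 0; i < measuredIterations; i++) {
                long start = System.nanoTime();
                sink += sum(input);
                total += System.nanoTime() - start;
            }

            System.out.println("Average: " + (total / measuredIterations) + " ns (sink = " + sink + ")");
        }
    }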

These days, the recommended way to do Java benchmarking is to use a reputable benchmarking framework. The two prime candidates are Caliper and Oracle's jmh tool.
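
For illustration, a minimal JMH benchmark might look something like this sketch (the class name and array size are invented for the example; in practice you typically generate a JMH project from the official Maven archetype and run the resulting benchmarks jar):

    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Thread)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public class SumBenchmark {

        private int[] input;

        @Setup
        public void setup() {
            input = new int[100_000];   // prepared before the measured iterations
        }

        @Benchmark
        public int sum() {              // JMH handles warmup, iteration and averaging
            int sum = 0;
            for (int i = 0; i < input.length; i++) {
                sum += input[i];
            }
            return sum;                 // returning the result prevents dead-code elimination
        }
    }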

And what are the limitations of performance measurements that you mentioned?

Well I have alluded to them above.

  • Performance measurements can be distorted by various environmental factors on the execution platform.

  • Performance can be dependent on inputs ... and this may not be revealed by simple measurement.

  • Performance (e.g. of C / C++ code) can be dependent on the compiler and compiler switches.

  • Performance can be dependent on hardware; e.g. processor speed, number of cores, memory architecture, and so on.

These factors can make it difficult to make general statements about the performance of a specific piece of code, and to make general comparisons between alternative versions of the same code. As a rule, we can only make limited statements like "on system X, with compiler Y and input set Z the performance measures are P, Q, R".

Stephen C

The number of lines of code has very little correlation with the execution speed of a program.

Your program will look completely different after it's processed by the compiler. In general, modern compilers perform many optimizations, such as loop unrolling, eliminating unused variables, removing dead code, and hundreds more.

So instead of trying to "squeeze" the last bit of performance/memory out of your program by using short instead of int, char[] instead of String, or whichever trick you think will "optimize" your program (premature optimization), just write it using objects and types that make sense to you, so it will be easier to maintain. Your compiler, interpreter, or VM should take care of the rest. If it doesn't, only then do you start looking for bottlenecks and start playing with hacks.

So what makes programs fast, then? Algorithmic efficiency; at least, it tends to make the biggest difference when the algorithm or data structure was not designed well. This is what computer scientists study.

Let's say you're given 2 data structures. An array, and a singly linked list.

An array stores things in a block, one after the other.

+-+-+-+-+-+-+-+
|1|3|2|7|4|6|1|
+-+-+-+-+-+-+-+

To retrieve the element at index 3, you simply go to the 4th square and retrieve it. You know where it is because you know it's 3 squares after the first one.

A singly linked list stores each item in a node. The nodes may not be stored contiguously in memory, but each node has a tag (a pointer or reference) on it telling you where the next item in the list is.

+-+    +-+    +-+    +-+    +-+    +-+    +-+
|1| -> |3| -> |2| -> |7| -> |4| -> |6| -> |1|
+-+    +-+    +-+    +-+    +-+    +-+    +-+

To retrieve the element at index 3, you have to start at the first node, follow its tag to the node at index 1, then to the node at index 2, and only then do you arrive at index 3. All because you don't know where the nodes are, so you have to follow a path to them.

Now, say you have an array and an SLL, both containing the same data, with length n. Which one would be faster? It depends on how you use it.

Let's say you do a lot of insertions at the end of the list. The algorithms (pseudocode) would be:

Array:
array[list.length] = element to add
increment length field

SLL:
currentNode = first element of SLL

while currentNode has next node:
    currentNode = currentNode's next element

currentNode's next element = new Node(element to add)
increment length field

As you can see, in the array algorithm it doesn't matter what the size of the array is: appending always takes a constant number of operations. Say indexing a[list.length] is 1 operation, assigning into it is another, and incrementing the length field and writing it back is 2 more; that's about 4 operations every time. But if you look at the SLL algorithm, it takes at least list.length operations just to find the last element in the list. In other words, the time it takes to add an element to the end of an SLL grows linearly with the size of the SLL, t(n) = n, whereas for the array it's more like t(n) = 4.
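
A rough Java translation of those two append operations is sketched below (the IntArrayList and SinglyLinkedList classes and their field names are invented for this illustration):

    import java.util.Arrays;

    public class AppendDemo {

        // Array-backed list: appending at the end is a constant number of operations
        static class IntArrayList {
            int[] data = new int[16];
            int length = 0;

            void append(int element) {
                if (length == data.length) {              // ignoring the occasional resize
                    data = Arrays.copyOf(data, data.length * 2);
                }
                data[length] = element;                   // go straight to the next free slot
                length++;
            }
        }

        // Singly linked list: appending at the end means walking every node first
        static class SinglyLinkedList {
            static class Node {
                int value;
                Node next;
                Node(int value) { this.value = value; }
            }

            Node head;
            int length = 0;

            void append(int element) {
                Node node = new Node(element);
                if (head == null) {
                    head = node;
                } else {
                    Node current = head;
                    while (current.next != null) {        // roughly list.length steps
                        current = current.next;
                    }
                    current.next = node;
                }
                length++;
            }
        }

        public static void main(String[] args) {
            IntArrayList array = new IntArrayList();
            SinglyLinkedList list = new SinglyLinkedList();
            for (int i = 0; i < 5; i++) {
                array.append(i);
                list.append(i);
            }
            System.out.println("array length = " + array.length + ", list length = " + list.length);
        }
    }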

I suggest reading the free book written by my data structures professor. It even has working code in C++ and Java.

Alex

Generally speaking, speed vs. lines of code is not an effective measure of performance, since performance depends heavily on your hardware and your compiler. There is something called Big-O notation, which gives you a picture of how fast an algorithm will run as the number of inputs increases.

For example, if your algorithm is O(n), then the time it takes to run scales linearly with the input size. If your algorithm is O(1), then the time it takes to run is constant.
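
For example (an illustrative sketch, not from the original question): returning the first element of an array is O(1), while summing all of its elements is O(n), no matter how many lines either method takes.

    public class BigOExample {

        // O(1): the running time does not grow with the size of the input
        static int first(int[] input) {
            return input[0];
        }

        // O(n): the running time grows linearly with the size of the input
        static long sum(int[] input) {
            long total = 0;
            for (int value : input) {
                total += value;
            }
            return total;
        }

        public static void main(String[] args) {
            int[] input = {1, 3, 2, 7, 4, 6, 1};
            System.out.println(first(input));   // constant time, however long the array is
            System.out.println(sum(input));     // time proportional to input.length
        }
    }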

I found this particular way of measuring performance useful because you learn that it's not really the number of lines of code that affects speed; it's your code's design that affects speed. Code that handles the problem more efficiently can be faster than less efficient code that has a tenth as many lines.

user3255175
  • Computational complexity does not measure absolute or relative performance. It simply characterises the way that performance changes as the problem size gets very large. – Stephen C Jan 05 '15 at 22:39