
When discussing computational complexity, it seems everyone generally goes straight to Big O.

Let's say, for example, I have a hybrid algorithm such as merge sort which uses insertion sort for smaller subarrays (I believe this is called tiled merge sort). It's still ultimately merge sort with O(n log n), but I want to discuss the behaviour/characteristics of the algorithm for small n, in cases where no merging actually takes place.
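
To make this concrete, here is roughly the kind of hybrid I mean (a minimal sketch only; the cutoff of 32 is an arbitrary illustrative choice, not a tuned value):

```python
def hybrid_merge_sort(a, cutoff=32):
    """Merge sort that falls back to insertion sort below `cutoff` (sketch only)."""
    if len(a) <= cutoff:
        # For any input no longer than the cutoff, this branch IS the algorithm:
        # plain insertion sort, and the merge machinery below never runs.
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a
    mid = len(a) // 2
    left = hybrid_merge_sort(a[:mid], cutoff)
    right = hybrid_merge_sort(a[mid:], cutoff)
    # Standard merge of the two sorted halves.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]
```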

For all intents and purposes the tiled merge sort is insertion sort, executing exactly the same instructions for the domain of my small n. However, Big O deals with the large and asymptotic cases and discussing Big O for small n is pretty much an oxymoron. People have yelled at me for even thinking the words "behaves like an O(n^2) algorithm in such cases". What is the correct way to describe the algorithm's behaviour in cases of small n within the context of formal theoretical computational analysis? To clarify, not just in the case where n is small, but in the case where n is never big.

One might say that for such small n it doesn't matter, but I'm interested in the cases where it does, for example when there is a large constant factor (such as the algorithm being executed many times) and where in practice it would show a clear trend and be the dominant factor. For example, the initial quadratic growth seen in the graph below. I'm not dismissing Big O, but rather asking for a way to properly tell both sides of the story.

[Graph: running time vs. n, showing segmented super-linear steps, with roughly quadratic growth within each tile]


[EDIT]

If, for "small n", constants can easily remove all trace of a growth rate, then either

  1. only the asymptotic case is discussed, in which case there is less relevance to any practical application, or
  2. there must be a threshold at which we agree n is no longer "small".

What about the cases where n is not "small" (n is sufficiently big that the growth rate will not be affected significantly by any practical constant), but not yet big enough to show the final asymptotic growth rate, so only sub-growth-rates are visible (for example the shape in the image above)?

Are there no practical algorithms that exhibit this behaviour? Even if there aren't, theoretical discussion should still be possible. Do we measure instead of discussing the theory purely because that's "what one should do"? If some behaviour is observed in all practical cases, why can't there be theory that's meaningful?


Let me turn the question around the other way. I have a graph that shows segmented super-linear steps. It sounds like many people would say "this is a pure coincidence, it could be any shape imaginable" (at the extreme of course) and wouldn't bat an eyelid if it were a sine wave instead. I know in many cases the shape could be hidden by constants, but here it's quite obvious. How can I give a formal explanation of why the graph produces this shape?

I particularly like @Sneftel's words "imprecise but useful guidance".

I know Big O and asymptotic analysis isn't applicable. What is? How far can I take it?


jozxyqk
  • Empirical experiments and [statistical significance](http://en.wikipedia.org/wiki/Statistical_significance), basically for real world usage. – amit Feb 21 '14 at 13:03
  • Also, the problem with talking about complexity of small inputs is that it is very dependent on a lot of things, including the machine running the algorithm. For example - how long will doing x1+y1, x2+y2, x3+y3, ... x16+y16 take? The answer is very dependent. A machine with vector additions (which are becoming more and more common) will take significantly less time than a machine without this feature. You cannot determine what the constants are without knowing EXACTLY where it will be run, and the internals of this machine, and even then - due to cache behavior, you might be limited. – amit Feb 21 '14 at 13:07
  • Asymptotic analysis is preferred for two excellent reasons: 1) it is machine independent and 2) it is mathematically tractable. With modern machines, the only alternative is benchmarking, if not despair :-( –  Feb 25 '14 at 08:51

7 Answers

6

For small n, computational complexity - how things change as n increases towards infinity - isn't meaningful, as other effects dominate.

Papers I've seen which discuss behaviour for small values of n do so by measuring the algorithms on real systems, and discuss how the algorithms perform in practice rather than from a theoretical viewpoint. For example, for the graph you've added to your post I would say 'this graph demonstrates an O(N) asymptotic behaviour overall, but the growth within each tile is bounded quadratic'.
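
A minimal sketch of that kind of measurement (illustrative only; it uses Python's built-in sorted, CPython's Timsort, itself a hybrid that switches to insertion sort on short runs, purely as a stand-in for the algorithm under study):

```python
import random
import timeit

# Time a real hybrid sort over a range of input sizes and let the measured
# numbers, rather than an asymptotic bound, describe the small-n behaviour.
for n in (8, 16, 32, 64, 128, 256, 512):
    data = [random.random() for _ in range(n)]
    t = timeit.timeit(lambda: sorted(data), number=10_000)
    print(f"n={n:4d}  time for 10,000 calls: {t:.4f} s")
```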

I don't know of a situation where a discussion of such behaviour from a theoretical viewpoint would be meaningful - it is well known that for small n the practical effects outweigh the effects of scaling.

Pete Kirkham
  • You will get tons of upvotes because this is the conventional answer, but it does not answer the question of suggesting a way to deal with these cases, just discarding it as "it is not important". – amit Feb 21 '14 at 13:09
  • @amit I didn't intend to say "it is not important". I'm saying "it is not meaningful". One one-hundredth of a dollar bill is not worth a cent. – Pete Kirkham Feb 21 '14 at 13:12
  • You are still not answering the question. It is not about its meaningfulness, it is about how to deal with it, a cliche won't make the answer any more relevant. If you think it is meaningless - vote to close. – amit Feb 21 '14 at 13:14
  • @amit the answer is to 'discuss how the algorithms perform in practice' rather than using computational complexity – Pete Kirkham Feb 21 '14 at 13:17
  • @amit there are countless questions on SO where the OP's suggestion was wrong and alternatives say 'don't do that, do this instead'. If you don't like that being the answer here, down vote it for the reason you give and we can agree to differ. – Pete Kirkham Feb 21 '14 at 13:24
  • @PeteKirkham, I think I'm starting to see what you're getting at. Would you be able to elaborate... http://chat.stackoverflow.com/rooms/48218/computational-complexity-for-small-n – jozxyqk Feb 24 '14 at 06:01
6

It's important to remember that asymptotic analysis is an analytic simplification, not a mandate for analyzing algorithms. Take selection sort, for instance. Yes, it executes in O(n^2) time. But it is also true that it performs precisely n*(n-1)/2 comparisons, and at most n-1 swaps (fewer when some elements already sit in their final positions). Asymptotic analysis is a tool for simplifying the (otherwise generally impractical) task of performance analysis, and one we can put aside if we're not interested in the "really big n" segment.
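
A quick instrumented sketch makes the comparison count concrete (illustrative only):

```python
def selection_sort_counts(a):
    """Selection sort instrumented to count comparisons and swaps (sketch only)."""
    comparisons = swaps = 0
    n = len(a)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1  # every pair (i, j) with j > i is compared exactly once
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
            swaps += 1
    return comparisons, swaps

data = [5, 3, 8, 1, 9, 2]
comparisons, swaps = selection_sort_counts(data)
print(comparisons)  # 15, i.e. exactly n*(n-1)/2 for n = 6
print(swaps)        # never more than n-1
```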

And you can choose how you express your bounds. Say a function requires precisely n + floor(0.01*2^n) operations. That's exponential time, of course. But one can also say "for data sizes up to 10 elements, this algorithm requires between n and 2*n operations." The latter describes not the shape of the curve, but an envelope around that curve, giving imprecise but useful guidance about the practicalities of the algorithm within a particular range of use cases.
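
That envelope is easy to check numerically for the hypothetical operation count above (sketch):

```python
import math

def exact_ops(n):
    # The hypothetical exact count from the paragraph above: n + floor(0.01 * 2^n).
    return n + math.floor(0.01 * 2 ** n)

# Exponential overall, yet for data sizes up to 10 the count never leaves [n, 2n].
for n in range(1, 11):
    assert n <= exact_ops(n) <= 2 * n
print([exact_ops(n) for n in range(1, 11)])
```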

Sneftel
  • +1. Basically, it's possible to model an algorithm running on a machine in as abstract or as detailed a way as you want, but there are advantages and disadvantages to both. At the most detailed level, you model numbers of clock cycles per CPU instruction etc., and don't even allow such simplifying assumptions as T(A then B) = T(A) + T(B) (because A and B might be accesses to nearby memory locations, in which case the first access will enable the second to be satisfied from cache). At the other extreme is Big-Oh. In the middle might be counting 2 different types of op (comparisons and swaps). – j_random_hacker Feb 27 '14 at 06:44
  • Thanks for your time! One of the few answers that actually addresses my issue without "you just don't/can't". I'm quite aware of what asymptotic analysis is for, but want to know the language to use for describing algorithms at the smaller level precisely because that's where asymptotic analysis doesn't apply. I guess "imprecise but useful guidance" is the thing I'm after. – jozxyqk Mar 03 '14 at 06:21
  • There's nothing as one-size-fits-all as big-O, unfortunately. I have found, however, that the "potential" method of amortized analysis has been useful when looking at limited areas of an algorithm's domain. – Sneftel Mar 03 '14 at 10:05
1

Complexity is not about execution time for one n on one machine, so there is no need to consider it even if the constant is large. Complexity tells you how the size of the input affects execution time. For small n, you can treat execution time as constant. This is one side.

From the other side, you are saying that:

  1. You have a hybrid algorithm working in O(n log n) for n larger than some k and O(n^2) for n smaller than k.
  2. The constant k is so large that the algorithm works slowly.

There is no sense in such an algorithm, because you could easily improve it.

Let's take a red-black tree. Operations on this tree are performed in O(log n) time, but with a large constant. So, on normal machines, it could work slowly (i.e. slower than simpler solutions) in some cases. There is no need to consider this when analyzing complexity. You need to consider it when you are implementing it in your system: you need to check whether it's the best choice considering the actual machine(s) on which it will be running and what problems it will be solving.

AndyG
Ari
1

You are right.

For small n, i.e. when only insertion sort is performed, the asymptotic behavior is quadratic O(n^2).

And for larger n, when tiled merge sort enters into play, the behavior switches to O(n log n).

There is no contradiction if you remember that every behavior has its domain of validity, before the switching threshold, call it N, and after it.

In practice there will be a smooth blend between the curves around N. But in practice too, that value of N is so small that the quadratic behavior does not have enough "room" to manifest itself.

Another way to deal with this analysis is to say that, N being a constant, the insertion sorts take constant time. But I would disagree that this is a must.

  • Unfortunately, to say "for small n ... behavior is O(n^2)" does not make any sense at all. The definition of O(.) starts with "there exists n0 such that for all n > n0 ...". It does not allow for any limitation on n. – Gene Feb 27 '14 at 02:38
  • From a technical standpoint, you are right. Unless you allow N to be as large as you want (while still under n). The OP is asking for a way to characterize complexity for small n. Merely answering O(1) is spoiling good information that we have: for n –  Feb 27 '14 at 07:33
  • For any parameter with an upper bound, behavior _is always_ O(1). Big-O is big-O. You shouldn't misuse it to express your own intuitive notion of algorithm behavior. That just creates confusion, which we have seen frequently in SO postings. It's easy to re-state what you mean without the existing imprecision. – Gene Feb 27 '14 at 13:58
1

Let's unpack things a bit. Big-O is a tool for describing the growth rate of a function. One of the functions to which it is commonly applied is the worst-case running time of an algorithm on inputs of length n, executing on a particular abstract machine. We often forget about the last part because there is a large class of machines with random-access memory that can emulate one another with only constant-factor slowdown, and the class of problems solvable within a particular big-O running-time bound is equivalent across these machines.

If you want to talk about complexity on small inputs, then you need to measure constant factors. One way is to measure running times of actual implementations. People usually do this on physical machines, but if you're hardcore like Knuth, you invent your own assembly language complete with instruction timings. Another way is to measure something that's readily identifiable but also a useful proxy for the other work performed. For comparison sorts, this could be comparisons. For numerical algorithms, this is often floating-point operations.
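
For a comparison sort, that proxy can be as simple as a wrapper type that counts comparisons (a sketch; Python's built-in sorted stands in here for whatever algorithm is being analysed):

```python
import random

class Counted:
    """Wraps a value and counts every '<' comparison the sorting code performs."""
    count = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.count += 1
        return self.value < other.value

# CPython's sorted() compares elements only with '<', so the class-level counter
# captures exactly the number of comparisons performed for each input size.
for n in (4, 8, 16, 32):
    Counted.count = 0
    data = [Counted(random.random()) for _ in range(n)]
    sorted(data)
    print(f"n={n:2d}  comparisons: {Counted.count}")
```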

David Eisenstat
1

Read Knuth's "The Art of Computer Programming" series, starting with "Volume 1: Fundamental Algorithms", section "1.2.10: Analysis of an Algorithm". There (and throughout the rest of his seminal work) he shows how exact analysis can be conducted for arbitrary problem sizes, using a suitable reference machine, by taking a detailed census of every processor instruction executed.

Such analyses have to take into account not only the problem size, but also any relevant aspect of the input data distribution that will influence the running time. For simplification, the analyses are often limited to the study of the worst case, the expected case or the output-sensitive case, rather than a general statistical characterization. And for further simplification, asymptotic analysis is used.

Quite apart from the fact that, except for trivial algorithms, this detailed approach is mathematically very challenging, it has also become unrealistic on modern machines. Indeed, it relies on a processor behaving like the so-called RAM model, which assumes constant time per instruction and per memory access (http://en.wikipedia.org/wiki/Random-access_machine). Except maybe for special hardware combined with a hard real-time OS, these assumptions are nowadays completely wrong.

1

When you have an algorithm with a time complexity of, say, O(n^2), and another algorithm with a time complexity of, say, O(n), you cannot conclude from these two complexities that the latter algorithm is faster than the former for all input values. You can only say that the latter algorithm is asymptotically (that is, for sufficiently large input values) faster than the former. Here you have to keep in mind that in asymptotic notation constant factors are generally ignored, to make the time complexity of the algorithm easier to understand. As an example: merge sort runs in O(n log n) worst-case time and insertion sort runs in O(n^2) worst-case time, but as the hidden constant factors of insertion sort are smaller than those of merge sort, in practice insertion sort can be faster than merge sort for small problem sizes on many machines.

Asymptotic notation does not describe the actual running time of an algorithm. Actual running time is machine-dependent, since different machines have different architectures and different instruction cycle execution times. Asymptotic notation just describes, asymptotically, how fast an algorithm is with respect to other algorithms. It does not describe the behavior of the algorithm for small input values (n <= n0). The value of n0 (the threshold) depends on the hidden constant factors and lower-order terms, and the hidden constant factors depend on the machine on which the algorithm will be executed.
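
A rough timing sketch of the merge sort versus insertion sort point above (illustrative only; the crossover point depends entirely on the machine and the implementations used):

```python
import random
import timeit

def insertion_sort(a):
    # Quadratic growth, but very little per-element overhead.
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    # O(n log n) growth, but slicing, recursion and merging overhead on every call.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]

# For small inputs the lower constant factors of insertion sort typically win.
for n in (8, 16, 32, 64):
    data = [random.random() for _ in range(n)]
    t_ins = timeit.timeit(lambda: insertion_sort(data[:]), number=20_000)
    t_mrg = timeit.timeit(lambda: merge_sort(data[:]), number=20_000)
    print(f"n={n:3d}  insertion: {t_ins:.3f}s  merge: {t_mrg:.3f}s")
```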

Tanmoy Banerjee