8

I would like to know whether it is possible to write a program or algorithm that finds the time complexity of any given program taken as input.

Input: any program (P) [in any language, or in a particular language]

Output: the time complexity of that program (P).

Have there been any prior attempts to write such a program? Are there any algorithms available for this purpose?

If so, please provide links, references, or any other guidance you can.

user unknown
  • 1
    Similar to this post http://stackoverflow.com/questions/480775/programmatically-obtaining-big-o-efficiency-of-code – Anna Sep 22 '09 at 16:36
  • And http://stackoverflow.com/questions/7331801/why-does-the-halting-problem-make-it-impossible-for-software-to-determine-the-ti – pwan Jul 25 '12 at 13:06

4 Answers

30

No. It's not possible. This is a form of the halting problem.
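
To see why, here is a rough sketch (in Python) of the standard reduction. The `time_complexity` function stands for the hypothetical analyzer the question asks for; it is not a real API, and the point of the argument is that it cannot exist, because it would let you decide whether an arbitrary program halts on a given input.

    def time_complexity(program):
        # The hypothetical analyzer the question asks for; it cannot actually exist.
        raise NotImplementedError

    def would_halt(p, w):
        # If the analyzer existed, this would decide whether p halts on input w.
        def q(xs):
            p(w)                       # runs forever whenever p does not halt on w
            for i in range(len(xs)):   # then does trivial O(n^2) work
                for j in range(len(xs)):
                    pass
        # If p halts on w, q is O(n^2); otherwise q never finishes on any input
        # and has no finite complexity. So the analyzer's answer for q would
        # solve the halting problem, which is known to be undecidable.
        return time_complexity(q) == "O(n^2)"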

erickson
  • 2
    +1: Until you can prove the program P halts, you can't prove anything about its time complexity. – S.Lott Sep 22 '09 at 16:35
  • 1
    But for a non-Turing-complete language it might be possible :) (and it's an open question how useful such languages might be) – tonfa Sep 22 '09 at 17:04
  • Ignoring the theoretical bs about the halting problem, it would seem that you could make a program that would give you a rough estimate and just assume the program will eventually end. – corymathews Sep 22 '09 at 17:17
  • @corymatthews: No. You can't get a "rough estimate". Either you have a program whose complexity you can argue for -- as a mathematical exercise -- and be right, or you have a program which you cannot prove terminates, in which case you can't do anything more. – S.Lott Sep 22 '09 at 17:32
  • 2
    If cory's program existed, then even if it only worked for problems known to halt, then it would solve that irritating P=NP question once and for all. Just run it against the "try every single program that exists in pseudo-parallel" algorithm for the travelling salesman problem, and find out whether that has polynomial complexity. If so, P=NP, if not P!=NP. Since P=NP is not solved, no such program exists. – Steve Jessop Sep 22 '09 at 18:19
  • @SteveJessop: You're wrong, your method could not decide P=NP, mainly because there are infinitely many such programs. – jpalecek Apr 18 '12 at 16:43
  • @jpalacek: I'm talking about a program whose input is an instance of TSP, and that dovetails every single program that exists (of which there are countably many), and tests the result of each (as and when it completes, if it does) to see whether it solves that instance of TSP. There's more than one way to write that program, but all of them have the property that they have polynomial complexity if and only if P=NP. So if cory's oracle exists, see what "rough estimate" it gives for one such program, and you have a "rough estimate" of whether P=NP. Whatever "rough estimate" means. – Steve Jessop Apr 18 '12 at 21:29
  • Any other NP-complete problem would do in place of TSP, of course -- the decision version of TSP is NP-complete. The whole point of the dovetailing technique is to run countably many programs "in parallel" on a deterministic machine. Note that my whole proposal falls squarely under the heading of "theoretical bs about the halting problem" that cory sets aside - he wasn't seriously proposing an infallible test, I'm just pointing out how I know that no such infallible test is known. – Steve Jessop Apr 18 '12 at 21:41
  • this has **nothing to do** with the halting problem, where the decision *subroutine* must be able to refer to the whole program which must be able to use/refer to the decision subroutine (only then the paradoxical degenerate program can be constructed proving the impossibility of the halting decision subroutine). Here the decision *program* is entirely extraneous to the program being analyzed which doesn't even know about the decision program, much less is able to refer to it. – Will Ness Jul 25 '12 at 12:04
4

Proving the complexity of an arbitrary algorithm is not possible, but in principle you could estimate it (see the sketch after the steps below):

  1. choose n, the size of the input
  2. run the algorithm
  3. observe t(n), the time needed to run the algorithm for input of size n
  4. choose another n, repeat steps 2 and 3 until you have a lot of data
  5. regress t(n) on n, n^k, log(n), n log(n), n!, or any other term that might be appropriate
  6. choose a term with statistical significance and declare that to be your estimated complexity of the algorithm
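
Here is a minimal sketch of that recipe, assuming Python and using a library `sorted` call as a stand-in for the program under test; for simplicity it fits only a single power-law model t(n) ~ c * n^k rather than the full set of candidate terms from step 5.

    import math
    import random
    import time

    def algorithm(data):                 # stand-in for the program under test
        return sorted(data)

    sizes = [1000, 2000, 4000, 8000, 16000, 32000]
    times = []
    for n in sizes:
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        algorithm(data)
        times.append(time.perf_counter() - start)

    # Fit t(n) ~ c * n^k by least squares in log-log space:
    # log t = log c + k * log n, so the slope k is the estimated exponent.
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    k = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    print("estimated exponent k =", round(k, 2))

Note that an O(n log(n)) routine shows up here as an exponent only slightly above 1, which is one reason step 5 should try several candidate terms rather than a single n^k.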

There are any number of pitfalls to this approach:

  1. This will not prove anything, only give an estimate
  2. If t(n) gets really large for large n, it will take a long time to collect enough data for your analysis
  3. There are many ways that this approach can be fooled unless you use huge values of n. For example, the following routine will look like O(1) unless you use astronomical values for n:

    import time
    def sneaky_sort(data):
        time.sleep(10 * 24 * 3600)  # sleep for 10 days
        return sorted(data)         # then run an O(n log(n)) sort


And other SO users can come up with many more. But in some cases, like when you do a complexity analysis of an algorithm and want to verify it empirically, this is still a useful approach.

Georg Schölly
mob
  • 1
    The problem is step 3: you can't be certain that the program will *ever* halt for arbitrary input. This is the halting problem. The consequence is that even this estimation method won't, in general, produce a useful measure. – Bob Cross Sep 22 '09 at 17:15
  • Worse, there are terminating algorithms that will still take until heat death of the universe to produce their answer. You should probably know this before you start trying to run it and gather data. – S.Lott Sep 22 '09 at 17:33
  • True, but if you had a tolerance, say MAX t(n), the method could be used to make the estimate for times less than t(n) (just force-kill the process after MAX t(n) time). It won't tell you if the program will halt, but you will discover either the running time of the algorithm or get a result of "this program runs so long that it's effectively non-usable for me". – Streklin Sep 22 '09 at 17:50
  • 1
    I wouldn't say this isn't useful in general. Most algorithms that you want to benchmark will trivially halt. As mobrule pointed out, this doesn't prove anything about runtimes, but it does provide a useful statistical test. – Nick Johnson Sep 22 '09 at 19:47
  • And for practical purposes, asymptotic behaviour is in any case completely, utterly irrelevant. I have literally never written a program where I care about its runtime if that runtime exceeds a year, and a year is well short of the limit as N tends to infinity. If your so-called algorithm doesn't halt for some inputs, then you have worse problems than that you can't measure its apparent complexity (for small N) by this method... – Steve Jessop Sep 23 '09 at 01:31
  • local empirical asymptotic behaviour is a useful extrapolation measure for running time, and easy enough to find as `k = log (t2/t1) / log (n2/n1)` for local `O(n^k)` behaviour (taken as run times `t1,t2` at problem sizes `n1,n2`). – Will Ness Jan 28 '12 at 15:21
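
For illustration, here is a worked instance of the two-point formula from the comment above, with invented timings (t1 = 0.5 s at n1 = 1000 and t2 = 2.0 s at n2 = 2000):

    import math
    k = math.log(2.0 / 0.5) / math.log(2000 / 1000)
    print(k)  # 2.0, i.e. locally O(n^2)-like behaviour
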
0

Can't we introduce some extra variables into the algorithm itself and find the complexity the same way we do it manually? For example, we could have a variable in an insertion sort, say n = 0, that keeps track of the entire looping part of the algorithm and then gives the answer as O(n^2).
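
A minimal sketch of that instrumentation idea, assuming Python: an insertion sort that counts its inner-loop steps (the function name and the sizes tried are just placeholders). Comparing the counts against n^2 for a few input sizes gives the empirical check, although, as the comment below notes, you still have to run it on many inputs.

    import random

    def instrumented_insertion_sort(a):
        steps = 0                        # counts inner-loop work, the analogue of
        for i in range(1, len(a)):       # the extra variable described above
            key = a[i]
            j = i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
                steps += 1
            a[j + 1] = key
        return steps

    for n in (100, 200, 400, 800):
        data = [random.random() for _ in range(n)]
        print(n, instrumented_insertion_sort(data))   # grows roughly like n^2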

Manthan Shah
  • 1
    you still have to run it on many different inputs. It being self-observant instead of you manually calculating the averages doesn't change *that* fact. But if your routine lives inside some kind of dynamic system which is constantly in use then yes, self-observancy/awareness is a very good idea, IMO. The *supervisor* part of a system would observe and detect which parts need improvement, and choose another algorithm. – Will Ness Jul 25 '12 at 11:41
-3

Well, I think you could do this by assuming all cases; a rough sketch follows the steps below:

  1. First make a module which can extract each of the instructions.
  2. Build a database of instructions to match against the program's instructions.
  3. Calculate the complexity by fetching the appropriate time complexity you set for each instruction in the database, and that's all.
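
A rough sketch of that lookup-table idea, assuming Python; the instruction names and costs here are invented for illustration. Note that summing fixed per-instruction costs only accounts for straight-line code, since how many times a loop body runs depends on the input.

    # Hypothetical per-instruction costs; real values would come from the database.
    instruction_costs = {"load": 1, "store": 1, "add": 1, "compare": 1}

    def static_cost(instructions):
        # Total cost of a straight-line sequence of instructions.
        return sum(instruction_costs[op] for op in instructions)

    print(static_cost(["load", "add", "compare", "store"]))  # 4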

Shahid