1

Title basically, can we design a program that inputs another code file, say for example a Python program, and can tell you its time complexity?

The program can read the program word for word and indent for indent, and can count how many for or while statements it encounters. It can then see if they are nested for quadratic time. I feel like it's not like the Halting problem, since we're not looking to see if it will end, just its time complexity. But, what about algorithms that implement recursion? Would such a program be possible to write still?

Sorry if this seems like a silly questions, I was pondering this and was thinking of trying to write it myself.

Sebastian
  • 1,834
  • 2
  • 10
  • 22
  • The minimum O notation can be theoretically/structurally deduced only from a subset of problems, not for every possible program/algorithm. But for some of those limited cases, you could write a program, which perfectly solves it. The more the program flow depends on the result of the calculations, the more difficult it gets. Additionally/instead you can measure the running time dependency for large inputs. But beside the practical problems, this is still a heuristic and no proof and the problem can be O(1) with just huge constants and your input was still too small. – Sebastian Jul 19 '22 at 08:19
  • This question has important practical uses beside knowing the time complexity itself: Generating static analyzers, optimizers and correctness proofs. See also here https://matt.might.net/articles/intro-static-analysis/ – Sebastian Jul 19 '22 at 08:29
  • If you can see a program's time complexity, then you can see if it will halt or not (finite vs infinite time complexity). In other words, your program could be used to solve the halting problem. Therefore, your program cannot exist :) – Berthur Jul 19 '22 at 11:14

2 Answers2

3

Mostly not.

"Doesn't end" is just another time complexity. And therefore you can't write a program that determines how quickly things halt without determining first whether they halt at all.

Furthermore it doesn't take long to get into very hard problems. For example consider the following function:

def hailstone(n):
    answer = 0
    while 1 < n:
        answer += 1
        if 0 == n % 2:
            n = n // 2
        else:
            n = 3*n + 1
    return answer

Just a while and an if. But if you can tell me that this runs in time O(n), then you've just solved the Collatz Problem.

That said, it IS possible to produce an upper bound for some useful code. However said upper bound has to be infinity for a lot of code.

btilly
  • 43,296
  • 3
  • 59
  • 88
1

you can do this by measuring runtimes for datasets of different sizes and then fit the complexity like this:

however you still need to create meaningful input data in order this to work reliably, also you need precise enough time measurements , and also do not forget to flush CACHEs in case of reusing the same input data...

So you should pay attention to:

  • dataset size
  • dataset content
  • measured times should be at least ~100 ms

I recommend to see these:

also algorithms follow complexity curve usually only after some threshold dataset size so you might want to add some kind of detection of that ... and then use only bigger datasets.

However note that obtained complexity is not really the raw complexity of used algorithm but a coumulation of both algorithm and used computing architecture features so the results might differ from what you expect.

Spektre
  • 49,595
  • 11
  • 110
  • 380
  • Good luck with the Collatz example that I gave. This kind of numerical experiment is indicative but never a proof. – btilly Jul 20 '22 at 03:35
  • @btilly I do not see a problem there ... you can run the measured algorithm in thread and kill it from parent process after it timeouts ... – Spektre Jul 20 '22 at 06:51
  • And then the O notation would be `O(∞)`? – Sebastian Jul 20 '22 at 07:43
  • @Sebastian no ... the result will be unknown due to timeout ... "infinite" and iterative solvers does not really have `n` we can only talk about complexity of its parts and and not as a whole code ... so even if code is parsed and understood (as OP suggest) the result will not be what you expect anyway ... for example imagine `for (int i=0;;){ i^=random(); }` ... what is the complexity of that? it certainly is not `O(n)` as there is no `n` ... the inner part is `O(1)` ... – Spektre Jul 20 '22 at 08:30
  • @Spektre Even if you run the first billion, you don't know what a billion+1 will do. You will produce an answer. But you will have no way to determine if it is correct. – btilly Jul 20 '22 at 15:00
  • @btilly that is why the answer would be `unknown due to timeout` ... its not incorrect one ... – Spektre Jul 20 '22 at 15:17
  • @Spektre The thing is that ALL input functions suffer from the same problem. If you produce an answer other than `unknown` for any of them based on a blackbox experiment like this, that answer is sometimes wrong. – btilly Jul 20 '22 at 15:25