
I've been studying algorithm analysis from many different resources lately, and one thing I'm currently confused about is why time complexity is so often defined in terms of the number of steps/operations an algorithm performs.

For instance, in Introduction to Algorithms, 3rd Edition, by Cormen et al., the authors state:

The running time of an algorithm on a particular input is the number of primitive operations or “steps” executed. It is convenient to define the notion of step so that it is as machine-independent as possible.

I've seen other resources define time complexity this way as well. I have a problem with this because, for one, it's called TIME complexity, not "step complexity" or "operations complexity." Secondly, while it's not a definitive source, an answer to a post here on Stack Overflow states, "Running time is how long it takes a program to run. Time complexity is a description of the asymptotic behavior of running time as input size tends to infinity." Further, the Wikipedia page for time complexity states, "In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm." Again, while these aren't definitive sources, things make logical sense using these definitions.

When analyzing an algorithm and deriving its time complexity function, such as in Figure 1 below, you get an equation that is in units of time. It CAN represent the number of operations the algorithm performs, but only if the constant factors (C_1, C_2, C_3, etc.) are each equal to 1.

Figure 1
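To be concrete, the kind of equation I mean has the generic form below (my own sketch in the style of the CLRS per-line cost tally; the terms are placeholders rather than the exact ones from the figure):

    % Generic running-time tally: each line i of the algorithm has some constant
    % cost c_i and executes t_i(n) times on an input of size n, so
    \[
      T(n) \;=\; \sum_i c_i \, t_i(n)
           \;=\; c_1 t_1(n) + c_2 t_2(n) + c_3 t_3(n) + \dots
    \]
    % This is in units of time; it collapses to a plain operation count only
    % when every c_i equals 1.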

So with all that said, I'm just wondering how it's possible for this to be defined as the number of steps when that's not really what it represents. I'm trying to clear things up and make the connection between time and the number of operations. I feel like there is a lot of information that hasn't been explicitly stated in the resources I've studied. I'm hoping someone can help clear things up for me, preferably without going into a discussion about Big-O, because that shouldn't be needed and, in my opinion, misses the point of the question.

Thank you everyone for your time and help.

GainzNerd
  • One big reason is that different computer configurations have different operation times; variations in CPU performance therefore directly affect how fast the same number of operations executes – Petronella Apr 08 '21 at 08:49
  • might be better on [cs.se], flag for a mod to migrate if you agree – AakashM Apr 08 '21 at 08:50

3 Answers


why time complexity is often defined in terms of the number of steps/operations an algorithm performs?

TL;DR: because that is how asymptotic analysis works; also, do not forget that time is a relative thing.


Longer story:

Measuring performance in time, as we humans understand time in daily use, doesn't make much sense: it is not always a trivial task to do, and, in a broader perspective, it isn't even meaningful.

How would you measure the space and time your algorithm takes? What would be the agreed-upon, predefined unit of measurement you apply in order to determine the running time/space complexity of your algorithm?

You could measure it with a clock, or use some library/API to see exactly how many seconds/minutes/megabytes your algorithm took, and so on.

However, all of this will be VERY variable, because the time/space your algorithm takes will depend on:

  1. The particular hardware you're using (architecture, CPU, RAM, etc.);
  2. The particular programming language;
  3. The operating system;
  4. The compiler you used to compile your high-level code into a lower-level abstraction;
  5. Other environment-specific details (sometimes even the temperature, as CPUs may scale their operating frequency dynamically).

Therefore, it is not a good idea to measure your complexity in precise timing (again, as we understand timing on this planet).

So, if you want to know the complexity (let's say time complexity) of your algorithm, why would it make sense for it to differ across machines, operating systems, etc.? Algorithm complexity analysis is not about measuring runtime on a particular machine, but about having clear, mathematically defined, precise boundaries for the best, average, and worst cases.

I hope this makes sense.

Fine, we finally get to the point: algorithm analysis should be done as a standalone, mathematical complexity analysis, which does not care about the machine, OS, system architecture, or anything else (apart from the algorithm itself). We need to observe the logical abstraction, without caring whether you're running it on Windows 10 with an Intel Core 2 Duo, on Arch Linux with an Intel i7, or on your mobile phone.

What's left?

The best way (so far) to analyze an algorithm is asymptotic analysis: an abstract analysis calculated on the basis of the input, which counts the steps and operations the algorithm performs in proportion to the size of that input.

This way you can speak about the algorithm per se, instead of being dependent on the surrounding circumstances.
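For instance, here is a minimal sketch (purely illustrative names, and a purely illustrative choice of what counts as a "step") of what counting steps in proportion to the input can look like:

    def linear_search(items, target):
        """Return (index, steps), counting one 'step' per comparison made."""
        steps = 0
        for i, value in enumerate(items):
            steps += 1                      # one comparison per element examined
            if value == target:
                return i, steps
        return -1, steps

    # The step count depends only on the input, never on the hardware or OS.
    for n in (10, 100, 1000):
        _, steps = linear_search(list(range(n)), -1)   # worst case: target absent
        print(n, steps)                     # prints 10 10, 100 100, 1000 1000

No matter which machine runs this, the count of comparisons for an input of size n is the same; only the seconds per comparison change.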


Moreover, not only should we not care about machine or peripheral factors, we also shouldn't care about lower-order terms and constant factors in the mathematical expression of the asymptotic analysis.

Constant Factors:

Constant factors correspond to instructions that are independent of the input data, i.e. that are NOT dependent on the input arguments.

A few reasons why you should ignore them are:

  1. Different programming languages, as well as their compiled files, will have different numbers of constant operations/factors;
  2. Different hardware will give different run-times for the same constant factors.

So, you should stop dwelling on constant factors and simply ignore them. Focus only on the important, input-related factors; therefore:

  1. O(2n) == O(5n), and both are O(n);
  2. 6n^2 == 10n^2 asymptotically, and both are O(n^2).

One more reason why we don't care about constant factors is that we usually want to measure the complexity for sufficiently large inputs, and when the input grows towards +infinity, it really makes no difference whether you have n or 2n.
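To make that precise, here is the standard textbook argument (added for illustration, not specific to this answer):

    % Standard definition: f(n) = O(g(n)) iff there exist c > 0 and n_0 such that
    % f(n) <= c * g(n) for all n >= n_0. The multiplier is simply absorbed into c:
    \[
      5n \le 5 \cdot n \ \text{for all } n \ge 1 \;\Rightarrow\; 5n = O(n),
      \qquad
      10n^2 \le 10 \cdot n^2 \ \text{for all } n \ge 1 \;\Rightarrow\; 10n^2 = O(n^2).
    \]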

Lower order terms:

A similar concept applies here:

Lower order terms, by definition, become increasingly irrelevant as you focus on large inputs.

When you have 5x^4 + 24x^2 + 5, you will never care much about any exponent lower than 4.
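Concretely (a standard limit argument, added here for illustration):

    % The lower-order terms vanish relative to the leading term as x grows:
    \[
      \lim_{x \to \infty} \frac{5x^4 + 24x^2 + 5}{x^4}
      \;=\; \lim_{x \to \infty}\left(5 + \frac{24}{x^2} + \frac{5}{x^4}\right)
      \;=\; 5,
    \]
    % so for large x the polynomial behaves like its x^4 term, up to a constant factor.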

Giorgi Tsiklauri

Time complexity is not about measuring how long an algorithm takes in seconds. It's about comparing different algorithms: how they will perform with a certain amount of input data, and how that performance develops when the input data gets bigger.

In this context, the "number of steps" is an abstract stand-in for time that can be compared independently of any hardware. I.e., you can't tell how long it will take to execute 1000 steps without exact specifications of your hardware (and of how long one step takes). But you can always tell that executing 2000 steps will take about twice as long as executing 1000 steps.
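For example (a toy sketch added for illustration, not part of the original answer), you can compare two hypothetical algorithms purely by their abstract step counts and watch how each count grows as the input doubles:

    def steps_linear(n):
        """Steps for a single pass over n items (e.g. summing a list)."""
        steps = 0
        for _ in range(n):
            steps += 1
        return steps

    def steps_quadratic(n):
        """Steps for a nested pass over n items (e.g. comparing all pairs)."""
        steps = 0
        for _ in range(n):
            for _ in range(n):
                steps += 1
        return steps

    # Doubling the input doubles the linear count but quadruples the quadratic one,
    # whatever hardware this happens to run on.
    for n in (1000, 2000):
        print(n, steps_linear(n), steps_quadratic(n))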

And you can't really discuss time complexity without going into Big-O, because that's what it is.

derpirscher

You should note that algorithms are more abstract than programs. You look at two algorithms on paper or in a book and want to analyze which works faster for input data of size N, so you must analyze them with logic and mathematical statements. You can also run them on a computer and measure the time, but that's not a proof.

Moreover, different computers execute programs at different speeds. It depends on CPU speed, RAM, and many other conditions. Even on a single computer, a program may run at different speeds depending on the resources available at the time.

So, time for algorithms must be independent of how long a single atomic instruction takes to execute on a specific computer; each one is considered just one step, or O(1). Also, we aren't interested in constants. For example, it doesn't matter whether a program has two or ten fixed instructions; both will run in a fraction of a microsecond. Usually, that number of instructions is bounded, and they all run fast on computers. What is important are the instructions or loops whose execution count depends on a variable, which could be the size of the input to the program.
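As a small sketch of that distinction (hypothetical code, added for illustration): the fixed setup below contributes a constant number of steps, while the loop contributes steps in proportion to the input size.

    def process(values):
        # Constant part: the same handful of operations regardless of len(values).
        total = 0
        count = len(values)

        # Input-dependent part: runs once per element, so its step count grows
        # with the size of the input.
        for v in values:
            total += v

        return total, count

    print(process([1, 2, 3]))          # (6, 3)
    print(process(list(range(100))))   # same fixed setup, but the loop ran 100 times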

Ahmad