0

I am usually able to answer this sort of question by graphing it on Desmos, but I'm having trouble figuring out how to represent these functions to visualise the difference in steepness, especially as m tends to n and as m grows beyond n.

An example where this comes up is "compare two arrays to find if there are any duplicates from the first array in the second array". Here, m and n are the lengths of the arrays. If m and n are equal, then the complexity of a nested-loop pairwise comparison solution is n^2. But if they are unequal, is the complexity graph 'worse' than n^2? I can't figure out how to graph this to check, or even how to express it mathematically.

Aditya M P
  • 5,127
  • 7
  • 41
  • 72
  • One great explanation I got from a chat room: `you can express this as "solve the inequality m*n < n*n", and you see that it is true for the cases where m < n`. I feel like a fool :D but I have my answer now. Any answers with more insights are still welcome! – Aditya M P Jul 08 '19 at 05:40
    The point I'm making in my answer is that `n^2` does not express the complexity for any possible solution to the problem "compare two arrays to find if there are any duplicates", so comparing the complexity has no meaning. If `m` and `n` _happen_ to be equal, you still don't express the complexity in terms of `n`, you must express it as O(m*n). – Patrick Roberts Jul 08 '19 at 05:50
  • @PatrickRoberts noted, that makes even more sense. Thank you! – Aditya M P Jul 08 '19 at 05:51
  • @PatrickRoberts is this why I can’t figure out how to graph them comparatively? Like say comparing `n^2` vs `log n`? – Aditya M P Jul 08 '19 at 05:53

3 Answers

1

It totally depends on what m is. Since we are talking about time complexity here, you can say without even plotting a graph that O(m*n) is better than O(n^2) when m grows more slowly than n. For example, m could be O(log n), O(n^(1/2)), etc.

After OP edited the question:

I'll address the "compare two arrays to find if there are any duplicates from the first array in the second array" part. Let the size of A be m and the size of B be n.

#include <unordered_map>
#include <vector>
using namespace std;

vector<int> find_duplicates(const vector<int>& A, const vector<int>& B){
    vector<int> duplicates;
    unordered_map<int, int> frequency;
    // Count occurrences of each element of B: O(n)
    for(auto num: B){
        frequency[num]++;
    }
    // For each element of A, check whether it still appears in B: O(m)
    for(auto num: A){
        if(frequency[num] > 0){
            duplicates.push_back(num);
            frequency[num]--;
        }
    }
    return duplicates;
}

The for(auto num: B){...} loop has time complexity O(n). The for(auto num: A){...} loop has time complexity O(m). The loops run one after the other, and we don't know whether m > n, m < n, or m = n, so the total time complexity is O(m + n), which is the same as O(max(m, n)).

In this solution, O(max(m, n)) grows more slowly than O(m*n). Hope this clears up your doubts.

Shridhar R Kulkarni
  • 6,653
  • 3
  • 37
  • 57
1

n and m are totally independent variables. If m could be expressed in terms of n, then the complexity would be expressed purely in terms of n.

A common scenario for independent variables n and m is to define the length of two arbitrarily sized arrays.

It doesn't make sense to compare these time complexities to each other at all because they refer to mutually exclusive classes of problems.

Using your example, O(m*n) defines the complexity of one possible solution for the problem

compare two arrays to find if there are any duplicates from the first array in the second array

However, the two arrays still have independent lengths, so even if they happen to be equal in a particular case, it is not valid to claim that the algorithm is now O(n^2). n and m are still independent variables, unless it is explicitly stated in the problem that they share the same length.

Every possible solution must iterate over each array at least once, neither array has a constant length, and neither length is expressed in terms of the other. So there is no solution to the problem above whose complexity can be defined purely in terms of n, and trying to reason about the relative efficiency of each complexity has no meaning.

Patrick Roberts
  • 49,224
  • 10
  • 102
  • 153
  • `If m could be expressed in terms of n, then the complexity would be expressed purely in terms of n.` That's a fair point, I just wrote down an example to illustrate what precisely I'm struggling with. – Aditya M P Jul 08 '19 at 05:37
1

For any given constant values of m and n, we can't say anything about which one will be faster. This is because Big-O, by definition, only applies as values tend to infinity. So going forward I'll assume we're talking about what happens when the values tend to infinity.


We want to know when O(m*n) is faster than O(n^2). The formal definitions of Big-O check either f(n) <= k*g(n) for some constant k, or the limit of f(n)/g(n). In either case, since n > 0, we can cancel one factor of n from both m*n and n^2 and end up with an equivalent check.

So now we want to find when O(m) is faster than O(n), i.e. when m is less than n (by more than a constant factor) for every value of n beyond some point.

This is, in fact, just the same as saying m = o(n), which is your answer. Note: that is little-O.

In less abstract terms, m can be any function of n "less than" n (by more than a constant factor). To give some examples:

  • m = 1 works.
  • m = log n works.
  • m = sqrt(n) works.
  • m = n / 231689 doesn't work (differs by a constant factor).
  • m = n - 9873054 doesn't work (differs by a constant factor).
  • m = n^2 doesn't work (m grows faster).

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138