0

There are two independent nested structures being iterated and connected somehow. The inner operations are hash table lookups.

# struct_a has unbounded length, but could be zero
for row in struct_a:

    # row.subAttr has unbounded length, but could be zero
    for sub in row.subAttr:
        ...

# struct_b has unbounded length, but could be zero
for row in struct_b:

    # row.subAttr has unbounded length, but could be zero
    for sub in row.subAttr:
        ...

What is the Big-O in this case (variable lengths)?

rodrigo-silveira
  • 12,607
  • 11
  • 69
  • 123

1 Answers1

0

The length of struct_a or row will ultimately be reduced to a constant in Big O notation. The mathematical benefit of algorithmic analysis using Big O notation is that we do not worry about the constants - in fact, they are usually dropped from the term. A function that takes 5n2 is simply expressed as O(n2).

In your case, no matter the length of struct_a or row (even zero or a very large number), these will ultimately be constants. The double loop is the main factor in Big O notation, and the above code has a complexity of O(n2). One n term is contributed by the outer loop, and one n term is contributed by the inner loop.

In some situations where the difference between the outer and inner loop is very large, you might express the above as O(mn), to illustrate the fact that the inner and outer loops have different lengths (m represents the length of one of the loops, and n represents the length of the other). In many use cases, the difference in lengths between the two loops is negligible on average, and so m*n is simplified to n2. In this case, there is an implicit understanding that even though you are using the single variable n, the two loop lengths may be different.

Obviously, the final runtime will be affected by the length of the structures, whether they are zero or very long. There are many functions with higher Big O complexities (even n3 or n!) that take much less time in real life than simpler functions (log(n) or n) simply due to constant factors.

This answer is related and illustrates the same point.

Brent George
  • 108
  • 2
  • 11
  • Isn't it important to identify what *n* represents? – Scott Hunter Jun 02 '22 at 16:28
  • It depends on the context whether you would specify that two different variables are being used. In some uses cases where the difference between the outer and inner loop is very large, you might express the above as O(mn), to illustrate the fact that the inner and outer loops have different lengths. In many use cases, the difference in lengths between the two loops is negligible on average, and so m*n is simplified to n^2 – Brent George Jun 02 '22 at 16:29
  • 1
    Since you have no idea what this particular context is, shouldn't you not make an assumption that this simplification is justified? – Scott Hunter Jun 02 '22 at 16:32
  • 1
    I added further clarification to the answer to remove this assumption. Is that more clear @ScottHunter? – Brent George Jun 02 '22 at 16:37