251

Are there any cases where you would prefer O(log n) time complexity to O(1) time complexity? Or O(n) to O(log n)?

Do you have any examples?

V.Leymarie
  • 2,708
  • 2
  • 11
  • 18
  • 68
    I would prefer an `O(log n)` algorithm to an `O(1)` algorithm if I understand the former, but not the latter... – Codor Dec 09 '15 at 13:28
  • 14
    There's tons of impractical data structures with O(1) operations from theoretical computer science. One example would be select() on bitvectors, which can be supported in o(n) extra space and O(1) per operation, using 5 layers of indirection. Simple binary search combined with O(1) rank() turns out to be faster in practice according to the author of the [Succinct Data Structure Library](https://github.com/simongog/sdsl-lite) – Niklas B. Dec 09 '15 at 13:35
  • 17
    Lower asymptotic complexity does not guarantee faster runtimes. Research matrix multiplication for a concrete example. – Connor Clark Dec 09 '15 at 16:40
  • 55
    Also...any algorithm can be converted to O(1), given a sufficiently large table lookup ;) – Connor Clark Dec 09 '15 at 16:42
  • 5
    For a concrete example, [SmoothSort](https://en.wikipedia.org/wiki/Smoothsort) has best case `O(n)` performance and worst case `O(n log n)`, whereas [QuickSort](https://en.wikipedia.org/wiki/Quicksort) has best case `O(n log n)` performance and worst case `O(n ^ 2)` (!). Which do you see in practice? – Boris the Spider Dec 09 '15 at 17:08
  • 20
    @Hoten -- That's assuming the table lookup is O(1), which isn't a given at all for the size of the tables you're talking about! :) – Jander Dec 09 '15 at 22:16
  • 10
    @Jander it will be O(1), because it will be independent of the actual input. It will instead depend on the size of the input space (ie the domain of the function) – Connor Clark Dec 09 '15 at 22:20
  • 6
    I'll take an algorithm I can just call from the language's library over anything self-implemented or third-party any time, *unless it will be an actual performance problem in my application*. Quicker written, easier documented, easier maintained, better tested. – DevSolar Dec 10 '15 at 08:54
  • 1
    The difference between O(1) and O(log n) is only of interest to theoretical computer scientists. People working in the Real World have better things to worry about, like the difference between O(n) and O(n^2). – Stig Hemmer Dec 10 '15 at 09:45
  • 5
    @Hoten If the input space is bounded then every algorithm is O(1). Complexity is only interesting when the input space isn't bounded, in which case your lookup table is certainly *not* O(1). For example, any lookup table on a fixed-word-size architecture is Ω(log(n)), which is clear because any branching program with maximum fanout equal to the word size must be logarithmic in the number of outputs. – Reinstate Monica Dec 10 '15 at 18:16
  • 2
    @Hoten Umm. Say we have *any* kind of graph algorithm that depends on all nodes to decide the solution. The only way to look up the result in your lookup table is to visit every node in your input at least once (otherwise how do you know which solution to pick?). Hence you have a complexity of at least O(V). Now obviously if your input size is bounded, O(V) is a constant which is in O(1), but then the same thing is true for absolutely every algorithm and you don't need any kind of lookup table for that. – Voo Dec 11 '15 at 17:39
  • 1
    @Voo Definitely right. Upon further consideration, what I said is only true of some algorithms, where the inputs are scalar values. – Connor Clark Dec 11 '15 at 19:59
  • 1
    @Hoten ah but the size of your integers increases logarithmically with the input size ;) – Voo Dec 11 '15 at 20:13
  • @Voo Big-O analysis typically ignores implementation details (cache, memory, etc.) ;) – Connor Clark Dec 11 '15 at 20:21
  • 2
    @Hoten Has nothing to do with implementation details. If your input is a single number, if that number can be arbitrarily large, it requires O(log n) bits to be represented which requires O(log n) time to be read. This is ignored by many analyses for good reason, but if you're not careful and ignore this you can easily prove that you have a polynomial solution for a NP hard algorithm. See e.g. [Pseudo-polynomial time](https://en.wikipedia.org/wiki/Pseudo-polynomial_time). – Voo Dec 11 '15 at 20:27
  • Or if another sample is easier: The input for our aforementioned graph problem can easily be represented as a single number (after all that's what computer memory basically is) - hence by the same argument as before it cannot be in O(1). – Voo Dec 11 '15 at 20:30
  • The question is a bit too broad (as can be seen from the very vague answers which go like: speed, space, algorithmic length, ...). Maybe you have some more specific application areas in mind? – NoDataDumpNoContribution Dec 11 '15 at 21:35
  • @Voo You would have to have a table where the keys of the look up included the graph itself. It would have to enumerate all possible graphs, since the graph is part of the input. The only problem then is if the graphs that can be used is an infinite set. Realistically, in computer programming, we can make the assertion that the domain is technically finite since the computer is finite. So Hoten is still right, at least from a certain point of view. – jpmc26 Dec 11 '15 at 21:52
  • 1
    @jpmc26 As soon as you assume that the input domain is finite, it doesn't matter whether you use a lookup table or not - in that case, *all* algorithms are O(1) by definition! Quicksort, knapsack, travelling salesman - all O(1) and that without any lookup tables involved. And if you don't assume the domain is finite, the lookup table won't get most problems down to O(1) either (you can get them all down to O(n) with it though). – Voo Dec 11 '15 at 23:09
  • 1
    @Hoten Your statement about table lookup makes no sense because we use big-O notation to show the asymptotic running time **growth**, not running time itself. i.e. your expression has to be valid for arbitrarily large n. – user541686 Dec 12 '15 at 18:58
  • There's a cute data structure called the [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) that can add and probabilistically check for the presence of an element in O(1) time, but has the possibility of incorrectly responding "present" when the element is not in fact present. Trees will give the answer with certainty in O(log n). – Iwillnotexist Idonotexist Dec 13 '15 at 02:11
  • 1
    I'd probably prefer an `O(log n)-time / O(1)-space` algorithm to an `O(1)-time / O(10^(n^999))-space` one :-) – paxdiablo Dec 13 '15 at 11:16
  • 1
    [This famous question](http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array) is highly related and demonstrates a case where branch prediction effects caused a *log n* running time (sorting an array) to be faster than *n* (looping over the array), even for a rather large *n*. – Lii Dec 15 '15 at 08:21
  • 1
    Among [sorting networks](https://en.wikipedia.org/wiki/Sorting_network), the AKS network is theorists' best construction—depth O(log n) compared to O((log n)**2) for [Batcher sort](https://en.wikipedia.org/wiki/Batcher_odd%E2%80%93even_mergesort) and bitonic sort—but the constant makes it utterly impractical. Knuth: "Batcher's method is much better, unless n exceeds the total memory capacity of all computers on earth!" – Colonel Panic Dec 15 '15 at 16:00
  • 1
    Mathematically, your question is asking whether log(x) can be less than a constant, or whether x can be less than log(x) or a constant. Take a look at this plot and you tell me: http://fooplot.com/plot/m3dwiddyny – Z boson Dec 15 '15 at 22:49
  • Sometimes the slower algorithm may have other properties which are important. For example, quicksort is generally faster than mergesort, but mergesort is inherently stable while quicksort isn't; thus, if the order of entries with equal keys matters mergesort may be preferable. – Bob Jarvis - Слава Україні Mar 20 '19 at 16:09

23 Answers

281

There can be many reasons to prefer an algorithm with higher big-O time complexity over one with lower complexity:

  • most of the time, lower big-O complexity is harder to achieve and requires skilled implementation, a lot of knowledge and a lot of testing.
  • big-O hides the details about a constant: an algorithm that runs in 10^5 steps is better from a big-O point of view than one that runs in 1/10^5 * log(n) steps (O(1) vs O(log(n))), but for any reasonable n the O(log(n)) one will perform better. For example, the best known complexity for matrix multiplication is O(n^2.373), but the constant is so high that (to my knowledge) no computational libraries use it.
  • big-O makes sense when you calculate over something big. If you need to sort an array of three numbers, it matters very little whether you use an O(n*log(n)) or an O(n^2) algorithm.
  • sometimes the advantage of the lower time complexity can be really negligible. For example, the tango tree data structure gives O(log log N) time complexity to find an item, but there is also a binary search tree which finds the same item in O(log n). Even for a huge n = 10^20 the difference is negligible.
  • time complexity is not everything. Imagine an algorithm that runs in O(n^2) time and requires O(n^2) memory. It might be preferable to one that needs O(n^3) time and O(1) space when n is not really big. The problem is that for huge n you can wait a long time, but I highly doubt you can find RAM big enough to use your algorithm.
  • parallelization is a good feature in our distributed world. There are algorithms that are easily parallelizable, and there are some that do not parallelize at all. Sometimes it makes sense to run an algorithm with a higher complexity on 1000 commodity machines rather than use one machine with a slightly better complexity.
  • in some places (security) a complexity can be a requirement. No one wants a hash algorithm that hashes blazingly fast (because then other people can brute-force your passwords way faster).
  • although this is not strictly about switching complexity class, some security functions should be written in a manner that prevents timing attacks. They mostly stay in the same complexity class, but are modified so that they always take the worst case to finish. One example is comparing whether strings are equal. In most applications it makes sense to break out fast as soon as the first bytes differ, but in security code you still wait until the very end to tell the bad news (see the sketch after this list).
  • somebody patented the lower-complexity algorithm and it is more economical for a company to use the higher complexity than to pay.
  • some algorithms adapt well to particular situations. Insertion sort, for example, has an average time complexity of O(n^2), worse than quicksort or mergesort, but as an online algorithm it can efficiently sort a list of values as they are received (as user input), whereas most other algorithms can only operate efficiently on a complete list of values.
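
To make the timing-attack bullet above concrete, here is a minimal sketch in JavaScript (my own illustration, not part of the original answer; the function names are made up). Both functions are O(n) in the worst case, but only the second one's running time is independent of where the first mismatch occurs. Production code should use a vetted primitive such as Node's crypto.timingSafeEqual instead.

// Naive comparison: returns as soon as a character differs,
// so the running time leaks how long the matching prefix is.
const naiveEqual = (a, b) => {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false; // early exit leaks timing information
  }
  return true;
};

// Constant-time style comparison: always scans every character and
// accumulates differences, so the time taken does not depend on the
// position of the first mismatch.
const constantTimeEqual = (a, b) => {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
};

console.log(naiveEqual("deadbeef", "deadbee0"));        // false (fast exit near the end)
console.log(constantTimeEqual("deadbeef", "deadbee0")); // false (same work as a full match)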
David G
  • 94,763
  • 41
  • 167
  • 253
Salvador Dali
  • 214,103
  • 147
  • 703
  • 753
  • 6
    Also, I've seen a few times that people were focused on the big-O of their central algorithm, but ignoring the setup costs. Building a hash-table, for example, can be more expensive than going through an array linearly if you don't need to do it over and over again. In fact, due to the way modern CPUs are built, even something like binary search can be just as fast on sorted arrays as a linear search - profiling is a necessity. – Luaan Dec 12 '15 at 10:27
  • @Luaan "In fact, due to the way modern CPUs are built, even something like binary search can be just as fast on sorted arrays as a linear search - profiling is a necessity." Interesting! Can you explain how binary search and linear search could take the same amount of time on a modern cpu? – DJG Dec 12 '15 at 20:52
  • 3
    @Luaan - Never mind, I found this: https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/ – DJG Dec 12 '15 at 20:58
  • 2
    @DenisdeBernardy: No, actually it doesn't. They could be algorithms in P. And even if these weren't, under reasonable definitions of what it means to parallelize, that wouldn't imply P != NP either. Also remember that searching the space of possible runs of a non-deterministic Turing machine is quite parallelizable. – einpoklum Dec 13 '15 at 17:31
228

There is always the hidden constant, which can be lower on the O(log n) algorithm. So it can work faster in practice for real-life data.

There are also space concerns (e.g. running on a toaster).

There's also the developer-time concern - O(log n) may be 1000× easier to implement and verify.
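
As a rough illustration of the developer-time point (a sketch of my own, assuming numeric keys; not taken from the answer): an O(log n) binary search over a sorted array is a handful of lines to write and verify, whereas a hand-rolled O(1) hash table needs a hash function, collision handling, and resizing before it is even correct.

// Minimal O(log n) lookup in a sorted array of numbers.
const binarySearch = (sorted, target) => {
  let lo = 0, hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1; // not found
};

console.log(binarySearch([2, 3, 5, 7, 11, 13], 7)); // 3
console.log(binarySearch([2, 3, 5, 7, 11, 13], 6)); // -1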

Nayuki
  • 17,911
  • 6
  • 53
  • 80
Alistra
  • 5,177
  • 2
  • 30
  • 42
  • Nice, thank you. I was thinking it could also be worth considering an O(log n) algorithm to ensure program stability (e.g. in self-balancing binary trees) – V.Leymarie Dec 09 '15 at 13:37
  • 16
    One example I can think of: for a small sorted array, it'd be easier and more compact for the programmer to implement a binary search function than to write a complete hash map implementation and use it instead. – Colonel Thirty Two Dec 09 '15 at 18:41
  • 5
    An example of complexity: finding the median of an unsorted list is easy to do in O(n * log n) but hard to do in O(n). – Paul Draper Dec 09 '15 at 23:38
  • @anatolyg but that would make it long and less to-the-point. see the other answers for examples... – djechlin Dec 10 '15 at 03:07
  • 1
    -1, don't put logs in your toaster... Kidding aside, this is spot on. `lg n` is so, so, so close to `k` for large `n` that most operations would never notice the difference. – corsiKa Dec 10 '15 at 17:47
  • 3
    There's also the fact that the algorithmic complexities most people are familiar with don't take cache effects into account. Looking up something in a binary tree is O(log2(n)) according to most people but in reality it's much worse because binary trees have bad locality. – Doval Dec 10 '15 at 18:09
  • @Doval, yes though that isn't asymptotic. – Paul Draper Dec 10 '15 at 18:55
  • @PaulDraper You're right, but since a cache miss is an order of magnitude slower than a cache hit (and there's multiple levels of cache), bad cache performance can make the constant factor very high. The point being that you can't just blindly compare the asymptotic behavior and ignore what happens for not-arbitrarily-huge sizes of `n`. – Doval Dec 10 '15 at 21:39
  • I think [Fibonacci heap](https://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times) is great example of high constant. It has better O() for some operations than standard trees/heaps but actual run time is almost always worse. – user158037 Dec 11 '15 at 10:58
  • In particular, `log n` is effectively a constant 64 or smaller. – R.. GitHub STOP HELPING ICE Dec 11 '15 at 20:18
  • @ColonelThirtyTwo: Bad example. Binary search is O(log n). Hash map is O(n); it's only "typically" constant-time. – R.. GitHub STOP HELPING ICE Dec 11 '15 at 20:19
  • @R. Doesn't that depend on your collision strategy? A no-collision hash map has `O(1)`; resolving collisions via a binary search tree would be `O(log n)`. – Colonel Thirty Two Dec 11 '15 at 20:41
  • @ColonelThirtyTwo: No-collision is not possible to add to efficiently. – R.. GitHub STOP HELPING ICE Dec 11 '15 at 20:56
57

I'm surprised nobody has mentioned memory-bound applications yet.

There may be an algorithm that has fewer floating point operations, either due to its complexity (i.e. O(1) < O(log n)) or because the constant in front of the complexity is smaller (i.e. 2n^2 < 6n^2). Regardless, you might still prefer the algorithm with more FLOPs if the lower-FLOP algorithm is more memory-bound.

What I mean by "memory-bound" is that you are often accessing data that is constantly out-of-cache. In order to fetch this data, you have to pull it from your actual memory space into your cache before you can perform your operation on it. This fetching step is often quite slow - much slower than the operation itself.

Therefore, if your algorithm requires more operations (yet those operations are performed on data that is already in cache, so no fetching is required), it will still out-perform the algorithm with fewer operations (which must be performed on out-of-cache data, requiring a fetch) in terms of actual wall-time.
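
Here is a rough way to see this effect in JavaScript (an illustrative sketch of my own; the sizes and the extra multiply are arbitrary, and actual timings depend entirely on the engine and hardware). The typed-array loop below does twice the arithmetic of the linked-list loop, yet it usually wins on wall-time because it streams through contiguous memory instead of chasing pointers scattered across the heap.

// Both loops are O(n). The array loop does 2 floating-point ops per element
// over contiguous memory; the list loop does 1 op per element but follows
// pointers in an order unrelated to allocation order.
const N = 1_000_000;

const arr = new Float64Array(N);
for (let i = 0; i < N; i++) arr[i] = Math.random();

// Build a linked list whose traversal order is shuffled relative to the
// order in which the nodes were allocated.
const nodes = Array.from(arr, value => ({ value, next: null }));
for (let i = nodes.length - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1));
  [nodes[i], nodes[j]] = [nodes[j], nodes[i]];
}
for (let i = 0; i < nodes.length - 1; i++) nodes[i].next = nodes[i + 1];
const head = nodes[0];

console.time("typed array, 2 ops per element");
let a = 0;
for (let i = 0; i < N; i++) a += arr[i] * 1.0000001; // extra multiply on purpose
console.timeEnd("typed array, 2 ops per element");

console.time("linked list, 1 op per element");
let b = 0;
for (let node = head; node !== null; node = node.next) b += node.value;
console.timeEnd("linked list, 1 op per element");

console.log(a.toFixed(2), b.toFixed(2)); // keep the results alive

The numbers will vary from machine to machine; the point is only that the FLOP count alone does not decide the winner.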

Salvador Dali
  • 214,103
  • 147
  • 703
  • 753
NoseKnowsAll
  • 4,593
  • 2
  • 23
  • 44
  • 1
    Alistra addressed this indirectly when talking about "space concerns" – Zach Saucier Dec 10 '15 at 02:33
  • 2
    A huge number of cache misses only multiplies the final execution time by a constant factor (not bigger than 8 for a 4-core 3.2 GHz CPU with 1.6 GHz RAM, and usually much lower), so it is counted as a fixed constant in big-O notation. Thus the only thing cache misses do is move the threshold of n where the O(n) solution starts to be slower than the O(1) solution. – Marian Spanik Dec 10 '15 at 12:09
  • 1
    @MarianSpanik You are of course correct. But this question asked for a situation where we would prefer `O(logn)` over `O(1)`. You could very easily imagine a situation where for all your feasible `n`, the less memory-bound application would run in faster wall-time, even at a higher complexity. – NoseKnowsAll Dec 10 '15 at 15:33
  • @MarianSpanik isn't a cache miss up to 300 clock cycles ? Where is the 8 coming from ? – HopefullyHelpful Feb 27 '17 at 12:07
44

In contexts where data security is a concern, a more complex algorithm may be preferable to a less complex algorithm if the more complex algorithm has better resistance to timing attacks.

Kevin K
  • 9,344
  • 3
  • 37
  • 62
  • 6
    While what you said is true, in that case, an algorithm executing in O(1) is by definition invulnerable to timing attacks. – Justin Lessard Dec 09 '15 at 19:50
  • 18
    @JustinLessard: Being O(1) means that there is some input size after which the algorithm's runtime is bounded by a constant. What happens below this threshold is unknown. Also, the threshold may not even be met for any real-world use of the algorithm. The algorithm might be linear and thus leak information about the length of the input, for example. – Jörg W Mittag Dec 09 '15 at 20:42
  • 12
    The runtime might also fluctuate in different ways, while still being bounded. If the runtime is proportional to `(n mod 5) + 1`, it is still `O(1)`, yet reveals information about `n`. So a more complex algorithm with smoother runtime may be preferable, even if it may be asymptotically (and possibly even in practice) slower. – Christian Semrau Dec 10 '15 at 12:36
  • This is basically why bcrypt is considered good; it makes things slower – David says Reinstate Monica Dec 10 '15 at 19:18
  • @DavidGrinberg That's the reason why bcrypt is used, and fits the question. But that is unrelated to this answer, which talks about timing attacks. – Christian Semrau Dec 11 '15 at 07:51
  • Another vulnerability of some O(1) algorithms is that they may do a single or fixed number of table lookups. An attacker could carefully pollute the cache and time the target process, or identify which cache line got loaded by observing which of its own data got evicted. – Phil Miller Dec 11 '15 at 22:07
  • Note the Spectre and Meltdown attacks. Both are timing attacks aimed at an O(1) routine. – Loren Pechtel Apr 10 '18 at 04:39
37

Alistra nailed it but failed to provide any examples so I will.

You have a list of 10,000 UPC codes for what your store sells: a 10-digit UPC, an integer for the price (in pennies), and 30 characters of description for the receipt.

O(log N) approach: You have a sorted list. 44 bytes per record if ASCII, 84 if Unicode. Alternatively, treat the UPC as an int64 and you get 42 and 72 bytes. 10,000 records--in the highest case you're looking at a bit under a megabyte of storage.

O(1) approach: Don't store the UPC; instead you use it as an index into an array. In the lowest case you're looking at almost a third of a terabyte of storage (10^10 possible UPCs × 34 bytes per record ≈ 340 GB).

Which approach you use depends on your hardware. On almost any reasonable modern configuration you're going to use the log N approach. I can picture the second approach being the right answer if for some reason you're running in an environment where RAM is critically short but you have plenty of mass storage. A third of a terabyte on a disk is no big deal, and getting your data in one probe of the disk is worth something. The simple binary-search approach takes about 13 probes on average (log2 10,000 ≈ 13.3). (Note, however, that by clustering your keys you can get this down to a guaranteed 3 reads, and in practice you would cache the first one.)

Loren Pechtel
  • 8,945
  • 3
  • 33
  • 45
  • 2
    I'm a little confused here. Are you talking about creating a 10-billion-entry array (most of which will be undefined) and treating the UPC as an index into that array? – David Z Dec 10 '15 at 19:23
  • 7
    @DavidZ Yes. If you use a sparse array, you may not get O(1) but it will use only 1MB memory. If you use an actual array, you're guaranteed O(1) access but it will use 1/3 TB memory. – Navin Dec 10 '15 at 23:29
  • On a modern system, it will use 1/3 TB of address space, but that doesn't mean it will come anywhere close to that much allocated backing memory. Most modern OSes don't commit storage for allocations until they need to. When doing this, you're essentially hiding an associative lookup structure for your data inside the OS/hardware virtual memory system. – Phil Miller Dec 11 '15 at 22:09
  • @Novelocrat True, but if you're doing it at RAM speeds the lookup time won't matter, no reason to use 40mb instead of 1mb. The array version only makes sense when storage access is expensive--you're going off to disk. – Loren Pechtel Dec 11 '15 at 22:23
  • 1
    Or when this isn't a performance-critical operation, and developer time is expensive - saying `malloc(search_space_size)` and subscripting into what it returns is as easy as it gets. – Phil Miller Dec 11 '15 at 22:26
36

Consider a red-black tree. It has access, search, insert, and delete of O(log n). Compare to an array, which has access of O(1) and the rest of the operations are O(n).

So given an application where we insert, delete, or search more often than we access and a choice between only these two structures, we would prefer the red-black tree. In this case, you might say we prefer the red-black tree's more cumbersome O(log n) access time.

Why? Because access is not our overriding concern. We are making a trade-off: the performance of our application is more heavily influenced by factors other than this one. We allow this particular algorithm to suffer in performance because we make large gains by optimizing other algorithms.

So the answer to your question is simply this: when the algorithm's growth rate isn't what we want to optimize, when we want to optimize something else. All of the other answers are special cases of this. Sometimes we optimize the run time of other operations. Sometimes we optimize for memory. Sometimes we optimize for security. Sometimes we optimize maintainability. Sometimes we optimize for development time. Even choosing an algorithm because its constant is low enough to matter is optimizing for run time, when you know the algorithm's growth rate isn't the biggest influence on run time. (If your data set were outside that range, you would optimize for the growth rate of the algorithm, because it would eventually dominate the constant.) Everything has a cost, and in many cases we trade the cost of a higher growth rate to optimize something else.

jpmc26
  • 28,463
  • 14
  • 94
  • 146
  • I'm not sure how the operations that give an array O(1) lookup and O(n) updates correspond to the red-black tree that people (at least I) usually think about. Most of the time I would first think of key-based lookup for a red-black tree. To match an array it would have to be a somewhat different structure that keeps the number of sub-nodes in the upper nodes to provide index-based lookup and re-indexes on insertion. Though I agree that red-black can be used to maintain balance; you could say "balanced tree" if you want to be vague about the details of the corresponding operations. – ony Jul 27 '19 at 09:29
  • @ony A red-black tree can be used to define a map/dictionary type structure, but it need not be. The nodes can just be elements, essentially implementing a sorted list. – jpmc26 Jul 28 '19 at 05:42
  • A sorted list and an array that defines the order of its elements carry different amounts of information. One is based on an order relation between the elements of a set; the other defines an arbitrary sequence that does not necessarily reflect any order between the elements. The other question is: what are the "access" and "search" operations that you declare to be `O(log n)` for a "red-black tree"? Inserting `5` at position 2 of the array `[1, 2, 1, 4]` results in `[1, 2, 5, 1, 4]` (element `4` gets its index updated from 3 to 4). How are you going to get this behavior in `O(log n)` in the "red-black tree" that you refer to as a "sorted list"? – ony Jul 28 '19 at 14:38
  • @ony "sorted list and array that defines order of elements have different amount of information." Yes, and that is part of why they have different performance characteristics. You're missing the point. One is not a drop in replacement for the other in all situations. They *optimize different things* and *make different trade offs*, and the point is that developers are making decisions about those trade offs constantly. – jpmc26 Jul 30 '19 at 06:20
  • @ony Access, search, insert, and delete have specific meanings in the context of algorithm performance. Access is fetching an element by position. Search is locating an element by value (which only has any practical application as a containment check for a non-map structure). Insert and delete should be straightforward, though. Example usage can be seen [here](https://www.bigocheatsheet.com/). – jpmc26 Jul 30 '19 at 06:26
  • @jpmc26, the [wiki](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree) lists only 3 operations. A search tree doesn't have "position" information, since the path to a node is defined by the order between nodes. Thus if you need to store a sequence with order preserved and access by position, you cannot use a search tree and there is no choice other than an array. If you don't care about position and need access by value, then with a choice between an array at `O(n)` (search) and a search tree at `O(log n)`, the choice is different from what you described. – ony Aug 01 '19 at 06:01
  • @ony That's exactly what I said. A tree *can* be used for position based access (although it won't preserve insertion order), but it is a poor choice for implementing that data structure. *That's* the point. You choose appropriate data structures for the problem at hand. – jpmc26 Aug 01 '19 at 15:59
  • And I'm saying there is no "position-based access" in a red-black search tree, since the structure doesn't save a user-provided "position". "We allow this particular algorithm to suffer performance" - there is no loss of performance, there is a loss of information. Would you please be kind enough to point me to an implementation, or at least a library with an API, of a red-black search tree with both "access" and "search" operations, so I can at least understand what the input for such operations is? Note that in that "cheatsheet" you linked, hash tables have no access operation, only search. Maybe you know why? – ony Aug 01 '19 at 20:29
23

Yes.

In a real case, we ran some tests on doing table lookups with both short and long string keys.

We used a std::map; a std::unordered_map with a hash that samples at most 10 points along the length of the string (our keys tend to be guid-like, so this is decent) and another with a hash that samples every character (in theory fewer collisions); an unsorted vector where we do an == compare; and (if I remember correctly) an unsorted vector where we also store a hash, first compare the hash, then compare the characters.

These algorithms range from O(1) (unordered_map) to O(n) (linear search).

For modest sized N, quite often the O(n) beat the O(1). We suspect this is because the node-based containers required our computer to jump around in memory more, while the linear-based containers did not.

O(lg n) exists between the two. I don't remember how it did.

The performance difference wasn't that large, and on larger data sets the hash-based one performed much better. So we stuck with the hash-based unordered map.

In practice, for reasonably sized n, O(lg n) is O(1). If your computer only has room for 4 billion entries in your table, then O(lg n) is bounded above by 32 (lg(2^32) = 32; in computer science, lg is shorthand for log base 2).

In practice, lg(n) algorithms are slower than O(1) algorithms not because of the logarithmic growth factor, but because the lg(n) portion usually means there is a certain level of complexity to the algorithm, and that complexity adds a larger constant factor than any of the "growth" from the lg(n) term.

However, complex O(1) algorithms (like hash mapping) can easily have a similar or larger constant factor.
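
A trivial sketch of my own to make the bound concrete: counting the probes binary search performs shows that even for an index range as large as 2^32 it never needs more than a few dozen steps.

// Count the probes binary search needs to find the last position in [0, n).
const probes = n => {
  let count = 0;
  let lo = 0, hi = n - 1;
  const target = n - 1;
  while (lo <= hi) {
    count++;
    const mid = Math.floor((lo + hi) / 2); // avoid 32-bit overflow from (lo + hi) >> 1
    if (mid === target) break;
    if (mid < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return count;
};

for (const n of [1e3, 1e6, 1e9, 2 ** 32]) {
  console.log(`n = ${n}: ${probes(n)} probes, ceil(lg n) = ${Math.ceil(Math.log2(n))}`);
}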

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
21

The possibility to execute an algorithm in parallel.

I don't know if there is an example for the classes O(log n) and O(1), but for some problems, you choose an algorithm with a higher complexity class when the algorithm is easier to execute in parallel.

Some algorithms cannot be parallelized, but have a very low complexity class. Consider another algorithm which achieves the same result and can be parallelized easily, but has a higher complexity class. When executed on one machine, the second algorithm is slower, but when executed on multiple machines, the real execution time gets lower and lower, while the first algorithm cannot speed up.

Andrea Dusza
  • 2,080
  • 3
  • 18
  • 28
Simulant
  • 19,190
  • 8
  • 63
  • 98
  • But all that parallelization does is reduce the constant factor that others have talked about, right? – gengkev Dec 15 '15 at 23:08
  • 1
    Yes, but a parallel algorithm can divide the constant factor by 2 every time you double the number of executing machines. A single-threaded algorithm can reduce the constant factor only once, by a fixed amount. So with a parallel algorithm you can dynamically react to the size of n and be faster in wall-clock execution time. – Simulant Dec 16 '15 at 08:19
15

Let's say you're implementing a blacklist on an embedded system, where numbers between 0 and 1,000,000 might be blacklisted. That leaves you two possible options:

  1. Use a bitset of 1,000,000 bits
  2. Use a sorted array of the blacklisted integers and use a binary search to access them

Access to the bitset has guaranteed constant time. In terms of time complexity, it is optimal, both from a theoretical and from a practical point of view (it is O(1) with an extremely low constant overhead).

Still, you might want to prefer the second solution. Especially if you expect the number of blacklisted integers to be very small, as it will be more memory efficient.

And even if you do not develop for an embedded system where memory is scarce, I can just increase the arbitrary limit of 1,000,000 to 1,000,000,000,000 and make the same argument. Then the bitset would require about 125 GB of memory. Having a guaranteed worst-case complexity of O(1) might not convince your boss to provide you with such a powerful server.

Here, I would strongly prefer a binary search (O(log n)) or binary tree (O(log n)) over the O(1) bitset. And probably, a hash table with its worst-case complexity of O(n) will beat all of them in practice.
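
A minimal sketch of the two options (my own illustration of this trade-off; the scaled-down limit and the sample values are arbitrary): the bitset gives O(1) lookups but always pays for the full range, while the sorted array plus binary search pays O(log n) per lookup but only stores the entries that are actually blacklisted.

const LIMIT = 1_000_000;

// Option 1: bitset over the whole range -- O(1) lookups, LIMIT/8 bytes
// regardless of how many values are actually blacklisted.
const bitset = new Uint8Array(Math.ceil(LIMIT / 8));
const blacklistBit = n => { bitset[n >> 3] |= 1 << (n & 7); };
const isBlacklistedBit = n => (bitset[n >> 3] & (1 << (n & 7))) !== 0;

// Option 2: sorted array of blacklisted values -- O(log n) lookups, memory
// proportional to the number of blacklisted entries only.
const blacklist = [42, 1337, 99999, 500000]; // kept sorted
const isBlacklistedSorted = n => {
  let lo = 0, hi = blacklist.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (blacklist[mid] === n) return true;
    if (blacklist[mid] < n) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
};

blacklistBit(1337);
console.log(isBlacklistedBit(1337), isBlacklistedBit(1338));       // true false
console.log(isBlacklistedSorted(1337), isBlacklistedSorted(1338)); // true false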

Chris Moschini
  • 36,764
  • 19
  • 160
  • 190
Philipp Claßen
  • 41,306
  • 31
  • 146
  • 239
13

My answer here Fast random weighted selection across all rows of a stochastic matrix is an example where an algorithm with complexity O(m) is faster than one with complexity O(log(m)), when m is not too big.

Community
  • 1
  • 1
Warren Weckesser
  • 110,654
  • 19
  • 194
  • 214
12

A more general question is whether there are situations where one would prefer an O(f(n)) algorithm to an O(g(n)) algorithm even though g(n) << f(n) as n tends to infinity. As others have already mentioned, the answer is clearly "yes" in the case where f(n) = log(n) and g(n) = 1. It is sometimes yes even in the case that f(n) is polynomial but g(n) is exponential. A famous and important example is the Simplex Algorithm for solving linear programming problems. In the 1970s it was shown to take exponential time (on the order of 2^n operations) in the worst case. Thus, its worst-case behavior is infeasible. But its average-case behavior is extremely good, even for practical problems with tens of thousands of variables and constraints. In the 1980s, polynomial-time algorithms for linear programming (such as Karmarkar's interior-point algorithm) were discovered, but 30 years later the simplex algorithm still seems to be the algorithm of choice (except for certain very large problems). This is for the obvious reason that average-case behavior is often more important than worst-case behavior, but also for a more subtle reason: the simplex algorithm is in some sense more informative (e.g. sensitivity information is easier to extract).

John Coleman
  • 51,337
  • 7
  • 54
  • 119
12

People have already answered your exact question, so I'll tackle a slightly different question that people may actually be thinking of when coming here.

A lot of the "O(1) time" algorithms and data structures actually only take expected O(1) time, meaning that their average running time is O(1), possibly only under certain assumptions.

Common examples: hashtables, expansion of "array lists" (a.k.a. dynamically sized arrays/vectors).

In such scenarios, you may prefer to use data structures or algorithms whose time is guaranteed to be absolutely bounded logarithmically, even though they may perform worse on average.
An example might therefore be a balanced binary search tree, whose running time is worse on average but better in the worst case.

user541686
  • 205,094
  • 128
  • 528
  • 886
10

To put my 2 cents in:

Sometimes a worse-complexity algorithm is selected in place of a better one when the algorithm has to run in a certain hardware environment. Suppose our O(1) algorithm non-sequentially accesses every element of a very big, fixed-size array to solve our problem. Then put that array on a mechanical hard drive, or a magnetic tape.

In that case, the O(log n) algorithm (suppose it accesses the disk sequentially) becomes more favourable.

uylmz
  • 1,480
  • 2
  • 23
  • 44
  • I might add here that on the sequential-access drive or tape, the O(1) algorithm instead becomes O(n), which is why the sequential solution becomes more favorable. Many O(1) operations depend on adding and indexed lookup being a constant-time algorithm, which it is not in a sequential-access space. – TheHans255 Dec 28 '15 at 14:47
9

There is a good use case for using a O(log(n)) algorithm instead of an O(1) algorithm that the numerous other answers have ignored: immutability. Hash maps have O(1) puts and gets, assuming good distribution of hash values, but they require mutable state. Immutable tree maps have O(log(n)) puts and gets, which is asymptotically slower. However, immutability can be valuable enough to make up for worse performance and in the case where multiple versions of the map need to be retained, immutability allows you to avoid having to copy the map, which is O(n), and therefore can improve performance.
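
A tiny sketch of the idea (my own illustration, not from the answer): an immutable binary search tree whose insert returns a new root and shares every untouched subtree with the previous version, so old versions stay valid without copying. Balancing (e.g. red-black) is omitted for brevity, so this toy version is only O(log n) on average rather than in the worst case.

// Each insert allocates new nodes only along the search path and shares the
// rest of the structure with the previous version of the tree.
const insert = (node, key) => {
  if (node === null) return { key, left: null, right: null };
  if (key < node.key) return { key: node.key, left: insert(node.left, key), right: node.right };
  if (key > node.key) return { key: node.key, left: node.left, right: insert(node.right, key) };
  return node; // key already present: reuse the whole subtree
};

const contains = (node, key) => {
  if (node === null) return false;
  if (key === node.key) return true;
  return key < node.key ? contains(node.left, key) : contains(node.right, key);
};

const v1 = [5, 3, 8].reduce(insert, null); // version 1
const v2 = insert(v1, 4);                  // version 2; v1 is untouched

console.log(contains(v1, 4));       // false -- the old version is still intact
console.log(contains(v2, 4));       // true
console.log(v1.right === v2.right); // true -- structural sharing, no O(n) copy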

Reinstate Monica
  • 2,420
  • 14
  • 23
9

Simply: because the coefficient - the costs associated with setup, storage, and the execution time of that step - can be much, much larger with a smaller big-O problem than with a larger one. Big-O is only a measure of the algorithm's scalability.

Consider the following example from the Hacker's Dictionary, proposing a sorting algorithm relying on the Multiple Worlds Interpretation of Quantum Mechanics:

  1. Permute the array randomly using a quantum process,
  2. If the array is not sorted, destroy the universe.
  3. All remaining universes are now sorted [including the one you are in].

(Source: http://catb.org/~esr/jargon/html/B/bogo-sort.html)

Notice that the big-O of this algorithm is O(n), which beats any known sorting algorithm to date on generic items. The coefficient of the linear step is also very low (since it's only a comparison, not a swap, that is done linearly). A similar algorithm could, in fact, be used to solve any problem in both NP and co-NP in polynomial time, since each possible solution (or possible proof that there is no solution) can be generated using the quantum process, then verified in polynomial time.

However, in most cases, we probably don't want to take the risk that Multiple Worlds might not be correct, not to mention that the act of implementing step 2 is still "left as an exercise for the reader".

TheHans255
  • 2,059
  • 1
  • 19
  • 36
7

Any time n is bounded and the constant multiplier of the O(1) algorithm is higher than the bound on log(n). For example, storing values in a hashset is O(1), but may require an expensive computation of a hash function. If the data items can be trivially compared (with respect to some order) and the bound on n is such that log n is significantly less than the hash computation on any one item, then storing in a balanced binary tree may be faster than storing in a hashset.

Dmitry Rubanovich
  • 2,471
  • 19
  • 27
6

In a realtime situation where you need a firm upper bound you would select e.g. a heapsort as opposed to a Quicksort, because heapsort's average behaviour is also its worst-case behaviour.

user207421
  • 305,947
  • 44
  • 307
  • 483
6

Adding to the already good answers: a practical example would be hash indexes vs. B-tree indexes in the Postgres database.

Hash indexes form a hash table index to access the data on disk, while B-tree indexes, as the name suggests, use a B-tree data structure.

In big-O terms these are O(1) vs O(log N).

Hash indexes are presently discouraged in Postgres since, in real-life situations (particularly in database systems), achieving hashing without collisions is very hard (it can lead to an O(N) worst-case complexity), and because of this it is even harder to make them crash-safe (via write-ahead logging - WAL - in Postgres).

This tradeoff is made in this situation since O(log N) is good enough for indexes, implementing O(1) is pretty hard, and the time difference does not really matter.

Madusudanan
  • 1,017
  • 1
  • 15
  • 36
4

When n is small, and O(1) is constantly slow.

HoboBen
  • 2,900
  • 4
  • 21
  • 26
3
  1. When the "1" work unit in O(1) is very high relative to the work unit in O(log n) and the expected set size is small-ish. For example, it's probably slower to compute Dictionary hash codes than to iterate an array if there are only two or three items (see the sketch after this list).

or

  2. When the memory or other non-time resource requirements in the O(1) algorithm are exceptionally large relative to the O(log n) algorithm.
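
A rough sketch of the first point (my own illustration, using JavaScript's Map in place of a Dictionary; the keys and values are made up): for two or three entries a plain array scan has essentially no overhead, while the hash lookup still pays for hashing and bucket traversal, so only profiling the actual workload tells you which wins.

// A tiny "dictionary": three key/value pairs.
const pairs = [['host', 'localhost'], ['port', '8080'], ['debug', 'true']];
const asMap = new Map(pairs);

// O(n) lookup, but n is 3 and the data sits in one small contiguous array.
const lookupLinear = key => {
  for (const [k, v] of pairs) {
    if (k === key) return v;
  }
  return undefined;
};

// O(1) lookup, but each call pays hashing and bucket overhead.
const lookupMap = key => asMap.get(key);

console.log(lookupLinear('port'), lookupMap('port')); // 8080 8080
// Which one is faster for n = 3 is an empirical question -- profile it.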
Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
3
  1. When redesigning a program, a procedure is found that could be optimized from O(lg N) to O(1); but if it's not the bottleneck of the program, and the O(1) algorithm is hard to understand, then you don't have to use the O(1) algorithm.
  2. When the O(1) algorithm needs more memory than you can supply, while the running time of the O(lg N) algorithm is acceptable.
yanghaogn
  • 833
  • 7
  • 15
2

In security applications it is often the case that we want to design problems whose algorithms are slow on purpose, in order to stop someone from obtaining an answer to a problem too quickly.

Here are a couple of examples off the top of my head.

  • Password hashing is sometimes made arbitrarily slow in order to make it harder to guess passwords by brute force. This Information Security post has a bullet point about it (and much more). A rough sketch follows after this list.
  • Bitcoin uses a controllably slow problem for a network of computers to solve in order to "mine" coins. This allows the currency to be mined at a controlled rate by the collective system.
  • Asymmetric ciphers (like RSA) are designed to make decryption without the keys intentionally slow, in order to prevent someone without the private key from cracking the encryption. The algorithms are designed to be cracked in hopefully O(2^n) time, where n is the bit-length of the key (this is brute force).
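
As a rough sketch of the first bullet, assuming Node.js and its built-in crypto module (the iteration count here is an arbitrary illustrative choice): key-stretching functions such as PBKDF2 make each hash deliberately expensive, so every brute-force guess costs the attacker the same tunable amount of work.

const crypto = require('crypto');

// Deliberately slow: many iterations of HMAC-SHA-256 per hash.
// Raising the iteration count raises the cost of every brute-force attempt.
const hashPassword = (password, salt, iterations = 600_000) =>
  crypto.pbkdf2Sync(password, salt, iterations, 32, 'sha256').toString('hex');

const salt = crypto.randomBytes(16).toString('hex');

console.time('one password hash');
const stored = hashPassword('correct horse battery staple', salt);
console.timeEnd('one password hash'); // intentionally not instant

// Legitimate verification pays this cost once per login; an attacker pays it per guess.
console.log(stored === hashPassword('correct horse battery staple', salt)); // true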

Elsewhere in CS, Quick Sort is O(n^2) in the worst case but in the general case is O(n*log(n)). For this reason, "Big O" analysis sometimes isn't the only thing you care about when analyzing algorithm efficiency.

Community
  • 1
  • 1
Frank Bryce
  • 8,076
  • 4
  • 38
  • 56
1

There are plenty of good answers, a few of which mention the constant factor, the input size and memory constraints, among many other reasons complexity is only a theoretical guideline rather than the end-all determination of real-world fitness for a given purpose or speed.

Here's a simple, concrete example to illustrate these ideas. Let's say we want to figure out whether an array has a duplicate element. The naive quadratic approach is to write a nested loop:

const hasDuplicate = arr => {
  for (let i = 0; i < arr.length; i++) {
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] === arr[j]) {
        return true;
      }
    }
  }
  
  return false;
};

console.log(hasDuplicate([1, 2, 3, 4]));
console.log(hasDuplicate([1, 2, 4, 4]));

But this can be done in linear time by creating a set data structure (i.e. removing duplicates), then comparing its size to the length of the array:

const hasDuplicate = arr => new Set(arr).size !== arr.length;
console.log(hasDuplicate([1, 2, 3, 4]));
console.log(hasDuplicate([1, 2, 4, 4]));

Big O tells us that the new Set approach will scale a great deal better from a time complexity standpoint.

However, it turns out that the "naive" quadratic approach has a lot going for it that Big O can't account for:

  • No additional memory usage
  • No heap memory allocation (no new)
  • No garbage collection for the temporary Set
  • Early bailout; in cases where a duplicate is likely to be near the front of the array, there's no need to check more than a few elements.

If our use case is bounded to small arrays, if we have a resource-constrained environment, and/or if other known common-case properties allow us to establish through benchmarks that the nested loop is faster on our particular workload, it might be a good idea.

On the other hand, maybe the set can be created one time up-front and used repeatedly, amortizing its overhead cost across all of the lookups.

This leads inevitably to maintainability/readability/elegance and other "soft" costs. In this case, the new Set() approach is probably more readable, but it's just as often (if not more often) that achieving the better complexity comes at great engineering cost.

Creating and maintaining a persistent, stateful Set structure can introduce bugs, memory/cache pressure, code complexity, and all other manner of design tradeoffs. Negotiating these tradeoffs optimally is a big part of software engineering, and time complexity is just one factor to help guide that process.


A few other examples that I don't see mentioned yet:

  • In real-time environments, for example resource-constrained embedded systems, sometimes complexity sacrifices are made (typically related to caches and memory or scheduling) to avoid incurring occasional worst-case penalties that can't be tolerated because they might cause jitter.
  • Also in embedded programming, the size of the code itself can cause cache pressure, impacting memory performance. If an algorithm has worse complexity but will result in massive code size savings, that might be a reason to choose it over an algorithm that's theoretically better.
  • In most implementations of recursive linearithmic algorithms like quicksort, when the array is small enough, a quadratic sorting algorithm like insertion sort is often called, because the overhead of recursive function calls on increasingly tiny arrays tends to outweigh the cost of nested loops. Insertion sort is also fast on mostly-sorted arrays, since the inner loop won't run much. This answer discusses this in an older version of Chrome's V8 engine, before they moved to Timsort. A rough sketch of the cutoff idea follows.
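
A rough sketch of that cutoff idea (my own; the threshold of 16 is an arbitrary illustrative value, and real implementations such as the one discussed in the linked answer tune it empirically): recurse with quicksort until a subarray is small, then finish it with insertion sort.

const INSERTION_SORT_THRESHOLD = 16; // arbitrary illustrative cutoff

const insertionSort = (arr, lo, hi) => {
  for (let i = lo + 1; i <= hi; i++) {
    const value = arr[i];
    let j = i - 1;
    while (j >= lo && arr[j] > value) {
      arr[j + 1] = arr[j];
      j--;
    }
    arr[j + 1] = value;
  }
};

const hybridQuicksort = (arr, lo = 0, hi = arr.length - 1) => {
  if (hi - lo + 1 <= INSERTION_SORT_THRESHOLD) {
    insertionSort(arr, lo, hi); // small range: the quadratic sort is cheaper here
    return arr;
  }
  const pivot = arr[Math.floor((lo + hi) / 2)];
  let i = lo, j = hi;
  while (i <= j) { // partition around the pivot value
    while (arr[i] < pivot) i++;
    while (arr[j] > pivot) j--;
    if (i <= j) {
      [arr[i], arr[j]] = [arr[j], arr[i]];
      i++;
      j--;
    }
  }
  hybridQuicksort(arr, lo, j);
  hybridQuicksort(arr, i, hi);
  return arr;
};

const demo = Array.from({ length: 50 }, () => Math.floor(Math.random() * 1000));
const sorted = hybridQuicksort(demo);
console.log(sorted.every((v, i) => i === 0 || sorted[i - 1] <= v)); // true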
ggorlen
  • 44,755
  • 7
  • 76
  • 106