I am currently taking a university course in data structures, and this topic has been bothering me for a while now (this is not a homework assignment, just a purely theoretical question).

Let's assume you want to implement a dictionary. The dictionary should, of course, have a search function, accepting a key and returning a value.

Right now, I can only imagine 2 very general methods of implementing such a thing:

  • Using some kind of search tree, which would (always?) give an O(log n) worst case running time for finding the value by the key, or,
  • Hashing the key, which essentially returns a natural number which corresponds to an index in an array of values, giving an O(1) worst case running time.
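For concreteness, the second method might be sketched like this (a hypothetical toy class; a real hash table must also handle collisions, which this ignores entirely):

```python
# Toy sketch of "hash the key to an array index".
# Collisions simply overwrite, so this is only an illustration of the idea.
class NaiveHashDict:
    def __init__(self, capacity=64):
        self.slots = [None] * capacity  # the array that provides random access

    def put(self, key, value):
        self.slots[hash(key) % len(self.slots)] = (key, value)

    def get(self, key):
        entry = self.slots[hash(key) % len(self.slots)]
        if entry is not None and entry[0] == key:
            return entry[1]
        raise KeyError(key)
```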

Is O(1) worst case running time possible for a search function, without the use of arrays?

Is random access available only through the use of arrays?
Is it possible through the use of a pointer-based data structure (such as linked lists, search trees, etc.)?

Is it possible when making some specific assumptions, for example, the keys being in some order?

In other words, can you think of an implementation (if one is possible) for the search function and the dictionary that will receive any key in the dictionary and return its value in O(1) time, without using arrays for random access?

GeReV
    A hashtable is only O(1) in the *worst case* if you have perfect hashing. In practice, a hashtable is O(1) in the *average case* and usually O(n) in the worst case. – Daniel Pryden Apr 15 '11 at 02:11
  • Interesting insight. I am not very familiar with how hashtables are implemented, aren't they generally implemented like the second method I've suggested? – GeReV Apr 15 '11 at 02:14
  • Yes, but in real-world use cases you need to take hash collisions into account. [Wikipedia](http://en.wikipedia.org/wiki/Hash_table#Collision_resolution) has a simple overview to get you started. – Daniel Pryden Apr 15 '11 at 02:16
  • Accounting for hash collisions means that each table entry will potentially hold multiple values, so it has to have an array. Depending on how strictly you interpret the "without arrays" requirement, that could mean that a hash table solution wouldn't be allowed for this question. – Sherm Pendley Apr 15 '11 at 02:48
  • @Sherm: The table entries could have linked lists instead, so to be more specific, "without arrays" means without arrays that could be used like the second method in the question - preventing random access with arrays. – GeReV Apr 15 '11 at 02:54

3 Answers


It is not the use of an array that makes the lookup O(1); it's the fact that the lookup time does not depend on the size of the data storage. Hence any method that accesses data directly, without a search proportional in some way to the size of the data storage, would be O(1).
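To illustrate the point (an assumed example, not part of the original answer): indexing into an array touches a fixed number of memory locations no matter how large the array is, while searching a linked list must visit a number of nodes that grows with the list:

```python
# Contrast: direct access vs. size-dependent traversal.
def array_lookup(arr, i):
    return arr[i]                # one access, independent of len(arr): O(1)

class Node:
    def __init__(self, key, value, nxt=None):
        self.key, self.value, self.next = key, value, nxt

def list_lookup(head, key):
    node = head
    while node is not None:      # may visit up to n nodes: O(n)
        if node.key == key:
            return node.value
        node = node.next
    raise KeyError(key)
```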

ThomasMcLeod
  • This is true, for example, when accessing the i-th item in an array, or finding the minimum/maximum of some heaps/trees. I was wondering if it's possible for collections containing N elements. – GeReV Apr 15 '11 at 02:21
  • @GeReV, I'm not quite sure what you're asking. An array has N elements. – ThomasMcLeod Apr 15 '11 at 02:39
  • I'm asking if there are, for example, some implementation of a search tree, a heap, etc. which doesn't use arrays as the underlying implementation but still retains O(1) search running times. – GeReV Apr 15 '11 at 02:49
  • @GeReV, a tree by its nature has a root and branches that, depending upon how the tree is balanced, generally expand in size exponentially from the root. Hence, if we access a tree through the root, we will always have θ(log n) average run time. This is simply because the number of nodes visited is proportional to log n. So the answer is no, there are no O(1) tree implementations. – ThomasMcLeod Apr 15 '11 at 04:13
  • I'll try to explain my question a bit differently. The question is not specifically about trees, vectors, and so on. I am claiming that the only way to access a random item in memory is only possible through arrays (that is, a predictable memory address), and asking if that claim is true. If it's not true, could there possibly be some elaborate data structure that can access any of its elements without being dependent on the amount of elements, without translating the key into a memory address, like hashtables do? – GeReV Apr 15 '11 at 15:09
  • @GeReV. First, we are talking about a "search" algorithm running in O(1) time. This is a bit of an anomaly, since, if you have a way to access an element in O(1), then you are not really searching, you are simply accessing. Second, think carefully about what an array is. It's a set of elements ordered by some id in a contiguous block of memory. The id transforms linearly to a memory address. However, in a hash table the key does not transform linearly to a memory address. Further, it is not necessarily contiguous. So a hash table has O(1) access w/o using an array per se. – ThomasMcLeod Apr 16 '11 at 03:02
  • A hash table has an indexing step, so it has to use an array. – Mike Dunlavey Apr 17 '11 at 00:47
  • @ThomasMcLeod, an array is not by definition ordered and doesn't have to have any id. – tster Apr 17 '11 at 00:56
  • @tster, are you saying that array indexes are unordered? Of course they are ordered. @Mike, not so. The hash can be any kind of id, not necessarily an array index. – ThomasMcLeod Apr 17 '11 at 01:43
  • @ThomasMcLeod, you said `an set of elements ordered by some id`, not "ordered by array index" – tster Apr 17 '11 at 03:41
  • @tster, I could have been clearer. I did say that the id transforms linearly to a memory address. By inference that's the array index. – ThomasMcLeod Apr 17 '11 at 04:03
  • Computer memory is an array. When you hash by some id, you have to transform it into an address, so you're indexing. If the computer doesn't have memory that can be addressed in one step, like a Turing machine, hash search is no longer O(1). – Mike Dunlavey Apr 17 '11 at 13:49
  • @Mike, I think now you're getting away from the original question. Since OP said "without using arrays for random access," I believe he was referring to array abstract data types, as in C arrays or C++ vectors. However, if we expand the definition of "array" to include all types of random access memory, then I agree that O(1) search is not possible. – ThomasMcLeod Apr 17 '11 at 17:26

Here's another answer I made on that general subject. Essentially, algorithms reach their results by processing a certain number of bits of information. The length of time they take depends on how quickly they can do that.

A decision point having only 2 branches cannot process more than 1 bit of information. However, a decision point having n branches can process up to log(n) bits (base 2).

The only mechanism I'm aware of, in computers, that can process more than 1 bit of information, in a single operation, is indexing, whether it is indexing an array or executing a jump table (which is indexing an array).
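A rough way to see the difference (hypothetical sketch): a chain of two-way comparisons extracts one bit per step, so binary search over n sorted keys needs about log2(n) decision points, whereas a single indexing operation consumes all log2(n) bits of the key at once:

```python
import bisect

keys = list(range(1024))        # 1024 sorted keys
values = [k * k for k in keys]

def search_by_comparisons(k):
    # each comparison is a 2-way branch: ~log2(1024) = 10 decisions
    i = bisect.bisect_left(keys, k)
    return values[i] if i < len(keys) and keys[i] == k else None

def search_by_indexing(k):
    # one indexing operation processes all 10 bits of k in a single step
    return values[k]
```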

Mike Dunlavey

You could have a dictionary implemented with a trie. The complexity is O(max(length(string))): if you have strings of limited size, then you could say it runs in O(1), since it doesn't depend on the number of strings you have in the structure. http://en.wikipedia.org/wiki/Trie
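A minimal sketch of such a trie (one caveat: the Python dicts used here for child links are themselves hash tables; a fixed 256-slot array per node, or a linked list of children, would avoid that at the cost of space or a small constant factor):

```python
# Trie (prefix tree): lookup cost depends on the key's length,
# not on how many keys are stored.
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.value = None
        self.present = False  # distinguishes stored keys from mere prefixes

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def put(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value, node.present = value, True

    def get(self, key):
        node = self.root
        for ch in key:        # O(len(key)) steps, independent of n
            node = node.children.get(ch)
            if node is None:
                raise KeyError(key)
        if not node.present:
            raise KeyError(key)
        return node.value
```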

titus
  • Yes but the minimum key size is still proportional to log n. So as n increases, so does the key. Hence this is still a O(log n) structure. – ThomasMcLeod Apr 15 '11 at 04:31
  • @ThomasMcLeod, While you are right in theory, this is basically a B-Tree just like a database. With a small amount of hand-waving you can consider it linear time lookup. For instance, if we allow the "string" to have the entire range of values for a byte, then with a length of 5 you have over a trillion possible values. While it's technically O(log n), in reality n is so small you can consider it constant. – tster Apr 17 '11 at 00:52