160

In MySQL, the default index type is a B-tree, and accessing an element in a B-tree takes logarithmic amortized time, O(log(n)).

On the other hand, accessing an element in a hash table takes O(1) time.

Why is a hash table not used instead of a B-tree to access data inside a database?

Sisir
JohnJohnGa

8 Answers

177

You can only access elements by their primary key in a hash table. This is faster than with a tree algorithm (O(1) instead of O(log n)), but you cannot select ranges (everything between x and y). Tree algorithms support this in O(log n), whereas hash indexes can result in a full table scan, O(n). Also, the constant overhead of hash indexes is usually bigger (which is no factor in theta notation, but it still exists), and tree algorithms are usually easier to maintain, grow with data, scale, etc.
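To make the range-versus-equality point concrete, here is a minimal SQL sketch; the `users` table and its columns are hypothetical and only for illustration. The equality predicate is something a hash index could answer directly, while the `BETWEEN` predicate relies on the ordering that only a B-tree provides.

```sql
-- Hypothetical table, only for illustration.
CREATE TABLE users (
    id   INT PRIMARY KEY,
    age  INT,
    name VARCHAR(100),
    INDEX idx_age (age)              -- InnoDB builds this as a B-tree
) ENGINE=InnoDB;

-- Equality lookup: a hash index could answer this in O(1);
-- the B-tree answers it in O(log n).
SELECT * FROM users WHERE age = 30;

-- Range lookup: the B-tree walks its ordered leaves in O(log n + k);
-- a hash index has no ordering, so it degrades to scanning everything.
SELECT * FROM users WHERE age BETWEEN 20 AND 30;
```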

Hash indexes work with pre-defined hash sizes, so you end up with a number of "buckets" in which the objects are stored. These objects are looped over again to really find the right one inside this partition.

So if you choose small sizes, you have a lot of overhead for small elements, while big sizes result in further scanning.

Today's hash table algorithms usually scale, but scaling can be inefficient.

There are indeed scalable hashing algorithms. Don't ask me how they work - it's a mystery to me too. AFAIK they evolved from scalable replication, where re-hashing is not easy.

It's called RUSH (Replication Under Scalable Hashing), and those algorithms are thus called RUSH algorithms.

However, there may be a point where your index exceeds a tolerable size compared to your hash sizes and your entire index needs to be rebuilt. Usually this is not a problem, but for huge-huge-huge databases, this can take days.

The trade-off for tree algorithms is small, and they are suitable for almost every use case, which is why they are the default.

However, if you have a very precise use case and you know exactly what (and only what) is going to be needed, you can take advantage of hash indexes.

The Surrican
  • Can you explain more on the index rebuilding? Does it mean that for x days while the index rebuilds, the table is totally unavailable for use during that period? – Pacerier Jul 05 '12 at 23:11
  • That depends on the database system in use. The question only covered the theoretical aspects. I do not really know about the implementation details of common database systems, but usually this should not be the case, because the second index can be built while the first is still being used. – The Surrican Jan 25 '14 at 01:31
  • "You can only access elements by their primary key" - you mean by the value of the column that has the index right, whether it's a primary key or other type of index? – Mark Fisher Jul 12 '18 at 16:37
  • What do you think about LSM-Trees? They use SSTables (Sorted String Tables), which are segments (files) of data sorted by key (thanks to an in-memory memtable, which is essentially an AVL tree emptied and written periodically to disk when a threshold of data is reached - typically a few MB) and use in-memory hash maps to efficiently retrieve data in segments. This kind of indexing also allows efficient range queries, as far as I understand. – tonix Aug 12 '21 at 23:12
162

Actually, it seems that MySQL uses both kinds of index (either a hash table or a B-tree), according to the MySQL documentation comparing B-tree and hash indexes (https://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html).

The difference between using a b-tree and a hash table is that the former allows you to use column comparisons in expressions that use the =, >, >=, <, <=, or BETWEEN operators, while the latter is used only for equality comparisons that use the = or <=> operators.
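For a concrete sketch of that difference, the MEMORY storage engine is one place where MySQL lets you pick the index type per index. The `session_cache` table and its columns below are hypothetical, chosen only to show which predicates each index type can serve:

```sql
-- Hypothetical MEMORY table; this engine supports USING HASH and USING BTREE.
CREATE TABLE session_cache (
    session_id CHAR(32) NOT NULL,
    user_id    INT,
    INDEX idx_session (session_id) USING HASH,   -- equality only: = and <=>
    INDEX idx_user    (user_id)    USING BTREE   -- also <, <=, >, >=, BETWEEN
) ENGINE=MEMORY;

-- The hash index can serve this equality lookup:
SELECT user_id FROM session_cache WHERE session_id = 'abc123';

-- A range predicate needs the B-tree index:
SELECT session_id FROM session_cache WHERE user_id BETWEEN 100 AND 200;
```

Note that on an InnoDB table a USING HASH request is accepted but silently turned into a B-tree, as another answer below points out.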

DowntownDev
lmiguelvargasf
  • That's unfair. The best answer has the lowest score. – Андрей Беньковский Dec 04 '16 at 13:13
  • This is exactly what I was looking for. I cared about how it affects my queries rather than a technical analysis. – Ben Dehghan Mar 10 '17 at 20:01
  • Yep! This answer helped me the most. – Ron Ross Jul 31 '18 at 04:17
  • Thanks a lot; it has been a long time, but this answer helped me a lot as well. – Reham Fahmy Sep 22 '18 at 09:20
  • The only answer that makes sense. You can always implement a list in hash table keys; the overhead is no different from B-trees, it's just that B-trees don't have a choice in the matter. Also, there is no need to ever rebuild a hash table on the fly; you can just make more of them (adding to the total seek time bit by bit) and rebuild offline. The main consideration here is that hash tables take more planning ahead, but IMO they achieve superior results if enough thought is put into them. – user81993 Jun 12 '21 at 15:53
  • Continuing the trend of "how does this affect me?": The [referenced link](https://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html) includes this crucial difference between B-tree and hash table indices: "Only whole keys can be used to search for a row. (With a B-tree index, any leftmost prefix of the key can be used to find rows.)" – rinogo Oct 22 '21 at 21:24
17

The time complexity of hash tables is constant only for sufficiently sized hash tables (there need to be enough buckets to hold the data). The size of a database table is not known in advance, so the table must be rehashed now and then to get optimal performance out of a hash table. The rehashing is also expensive.

Emil Vikström
  • Can rehashing be performed while the db is online? Or do we have to lock the table to rehash everything? – Pacerier Jul 05 '12 at 23:12
  • Pacerier, MySQL has no support for hash indices. It is theoretically possible to rehash the index while the database is still online (keep using the old index, create a new index, switch over to the new one when it is done), but I don't know what MySQL would do if they implemented hash indices. – Emil Vikström Jul 05 '12 at 23:16
  • MySQL supports hash indexes, right? : http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html – Pacerier Jul 05 '12 at 23:18
  • You seem to be correct. That was news to me! I must try to keep up with the development :-) Then you are far better off in answering your question than I am, but as I said: it's theoretically possible. – Emil Vikström Jul 05 '12 at 23:23
  • Btw, why do you say that "a btree can be easily paged out to disk but a hashtable cannot"? Couldn't a hashtable be stored on disk, since a simple key lookup would suffice? – Pacerier Feb 01 '15 at 16:35
  • You are right. My answer is actually wrong. If I answered this today I would say something like in [this answer for MongoDB](http://stackoverflow.com/a/13240079/238978), where I motivate why a b-tree has effectively O(1) lookup in practice. – Emil Vikström Feb 04 '15 at 14:24
  • @EmilVikström - The argument for MongoDB applies to MySQL, but uses about `log_100`. (A rule of thumb for InnoDB's fanout is 100; a billion rows would need 5 levels.) – Rick James Oct 26 '21 at 18:02
8
  • MySQL supports HASH in only a couple of situations: ENGINE=MEMORY (which is rarely used) and internally for a "hash-join".
  • Even when you ask an InnoDB table to have a HASH index, it silently turns it into BTree.
  • Hash comes close to O(1), but technically it is more like O(N^2) in the worst case. This is because of the need to handle "collisions".
  • MySQL picked BTree because it is more flexible than Hash (because it can handle ranges), while not being significantly slower than Hash.
  • Arguably, BTree comes close to O(1) in practice due to caching of blocks. Non-leaf nodes tend to be cached and stay in RAM, even if the leaf nodes come and go (for large tables).
  • MySQL maintains a BTree dynamically; while you can ask to rebuild an index (cf OPTIMIZE), it is rarely worth the effort.
  • In InnoDB, the data is stored in a BTree ordered by the PRIMARY KEY. Secondary keys are also stored in separate BTrees, but ordered by the secondary key column(s). The only other info in a leaf node is the PRIMARY KEY value. Hence, a secondary key lookup needs two BTree lookups (unless all the necessary columns are in the secondary + primary columns -- this is called "covering"); see the sketch after this list.
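A minimal sketch of the secondary-key and covering-index points above, using a made-up `orders` table (table and column names are assumptions, not from the original answer):

```sql
-- Hypothetical InnoDB table, only to illustrate the two-lookup point.
CREATE TABLE orders (
    id          INT PRIMARY KEY,       -- clustered B-tree: rows live in PK order
    customer_id INT,
    note        TEXT,
    INDEX idx_customer (customer_id)   -- secondary B-tree: stores (customer_id, id)
) ENGINE=InnoDB;

-- Two B-tree lookups: idx_customer yields the matching id values,
-- then the clustered index is probed to fetch `note`.
SELECT note FROM orders WHERE customer_id = 42;

-- "Covering" case: customer_id and id are both in the secondary index,
-- so the clustered index is never touched.
SELECT id FROM orders WHERE customer_id = 42;
```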

I conclude by saying that Big-O may be interesting, but the details of the implementation add complexity, and it is those details that determine performance for arbitrarily large tables.

Rick James
8

I think hash maps don't scale as well, and they can be expensive when the entire map needs to be rehashed.

6

To add to the nice answers above, here is some perspective from thinking about how to build a database.

First, robust hash tables are typically implemented with a bucketing system, or with a probing scheme such as quadratic probing, which is used to implement JavaScript "objects" (i.e. hash tables), for example. You can see a bucketed hash table implementation in JavaScript here.

You'll notice in this implementation that there is a lot more processing going on than meets the eye with O(1) notation. First, you run the key through the hashing function, which iterates over the length of the input string and does 5+ computational steps per iteration. Note, though, that these are fast computational steps because they are all done in registers and not in RAM. Next, you use that hash value to fetch a bucket. I'm not sure how many buckets there are or how long a bucket is, but the bucket is an array or linked list. You then iterate through the bucket items and compare every item with the input key you are fetching the value for. This is again a string comparison. So in all likelihood I would estimate at least 100 computational steps to fetch even a simple string from a hash table. All of these string comparisons add up.

In addition, the buckets might be half empty, which takes up a lot of useless space. Finally, when the hash table reaches a certain occupancy, it then has to double in size! It has to re-process and recompute everything. This can cause a noticeable glitch in a UI application.

B+trees, on the other hand, are a more compact data structure. You are still doing string comparisons, but you are only following at most, say, 20 links in the tree (in terms of depth), then scanning the children in the last tree node to find the exact match.

In this sense, I think in reality that B+trees or B-trees will perform on par with hash tables, especially naive implementations. Both systems can be optimized and fine-tuned, and I still think they will be close to equal. Only testing will tell. But trees come with the advantage of being more compact memory-wise. So after thinking about this for long periods of time and weighing every aspect of the equation, I am going to choose B+trees as the ideal solution to finding items by key quickly.

Lance
0

Pick DB/OS was based on hashing and worked well. With more memory these days to support efficient sparse hash tables, and redundant hashing to support modest range queries, I would say hashing may yet have its place (some would rather have other forms of non-range similarity-matching, such as wildcards and regexps). We also recommend copying to keep collision chains contiguous when memory hierarchies have big speed differences.

0

Another thing that may impact the choice as well: hash tables work well for mapping a key to exactly one value. However, in a situation where one key maps to a large number of elements (very common for single columns of a table), you can easily lose the O(1) behaviour, depending on exactly how the implementation handles it. B-trees do not have that problem and handle lots of duplicate entries excellently.
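As a hedged sketch of that situation, assume a hypothetical `tickets` table (the name and columns are made up) where huge numbers of rows share the same indexed value:

```sql
-- Hypothetical table where one indexed value matches many rows.
CREATE TABLE tickets (
    id     INT PRIMARY KEY,
    status VARCHAR(20),            -- e.g. only 'open', 'closed', 'pending'
    INDEX idx_status (status)      -- non-unique B-tree index
) ENGINE=InnoDB;

-- 'open' may match millions of rows. The B-tree keeps those duplicates
-- adjacent, so they come back from one ordered range scan; a hash index
-- would chain them all inside a single bucket and walk that chain.
SELECT id FROM tickets WHERE status = 'open';
```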

saolof
  • It is nearly impossible to make a Hash function that always maps to completely distinct values. Hashing for indexing purposes does not worry about that. That is, a few collisions are likely in any Hash implementation. Hence "_usually_ O(1)". – Rick James Oct 26 '21 at 17:15
  • InnoDB's `PRIMARY KEY` BTree necessarily has no duplicates (the PK is unique). Secondary indexes implicitly include the PK, hence they have no dups either. – Rick James Oct 26 '21 at 17:19