Costs formula
Nothing:
- insert: O(1)
- search O(n)
- cost search = n db lookup operations
- cost insert = db insert
a) B+ tree of order b
- insert: O (log n)
- search: O (log n)
- cost search = log_b n lookup operations + db lookup
- cost insert = log_b n lookup operations + db insert
- with increasing order, the number of lookup operations go down, but the cost per lookup operation increases
b) Hashtable
- insert: O(1)
- search: O(1)
- cost search = hash calculation [+ extra buckets when collisions happen] + db lookup
- cost insert = hash calculation [+ extra buckets when collisions happen] + db insert
Cost at n = 1000
Nothing:
- cost search = 1000 db lookup operations
- cost insert = 1 db insert
a1) B+ tree of order 2
- cost search = 10 lookup operations + 1 db lookup
- cost insert = 10 lookup operations + 1 db insert
a2) B+ tree of order 10
- cost search = 3 lookup operations + 1 db lookup
- cost insert = 3 lookup operations + 1 db insert
b) Hashtable
- cost search = hash calculation [+ extra buckets when collisions happen] + db lookup
- cost insert = hash calculation [+ extra buckets when collisions happen] + db insert
Cost at n = 1000000
Nothing:
- cost search = 1000000 db lookup operations
- cost insert = 1 db insert
a1) B+ tree of order 2
- cost search = 20 lookup operations + 1 db lookup
- cost insert = 20 lookup operations + 1 db insert
a2) B+ tree of order 10
- cost search = 6 lookup operations + 1 db lookup
- cost insert = 6 lookup operations + 1 db insert
b) Hashtable
- cost search = hash calculation [+ extra buckets when collisions happen] + db lookup
- cost insert = hash calculation [+ extra buckets when collisions happen] + db insert
There are also quite large costs gains to be gotten by hitting the hardware caches for subsequent hits. The exact gains depend on the hardware your database is installed on.
The costs are not all easily comparable. The hash calculation is generally more expensive than the lookups, but as n gets large it stays the same. B tree lookup and sequential DB lookups are likely to hit hardware caches (due to entire blocks being loaded).
The larger a table, the more important it is to have an index (see cost at n=1000 vs n=1000000). So the number of writes vs reads will vary with the size of your table.
You should also take your specific queries into account. For example, hash tables are not ordered while B trees are. So if a query needs to pick up all values between a minimum and a maximum value, a B tree would perform better than a hash table (a good hash is uniformly distributed).
In general, you would have to measure performance of the queries and inserts you would be using in practice. You could start without an index and then add one when you need it. Indexes can be added later without having to change the queries of the programs using your DB (the response time of the queries will change though, reads will get faster if they use the index, writes will get slower).
If you are in the specific case where you have a load operation with a lot of inserts followed by a period of reads, you can temporarily turn off the index and recalculate it after the load.
References:
B trees vs hash tables:
https://cs.stackexchange.com/questions/270/hash-tables-versus-binary-trees
Cache
http://www.moreprocess.com/devices/computer-memory-hierarchy-internal-register-cache-ram-hard-disk-magnetic-tape