What I am sure of :
- I am working with Java/Eclipse on Linux and trying to store a very large number of key/value pairs of 16/32 bytes respectively on disk. Keys are fully random, generated with SecureRandom.
- The speed is constant at ~50000 inserts/sec until it reaches ~1 million entries.
- Once this limit is reached, the java process oscillates every 1-2 seconds from 0% CPU to 100%, from 150MB of memory to 400MB, and from 10 inserts/sec to 100.
- I tried with both Berkeley DB and Kyoto Cabinet and with both Btrees and Hashtables. Same results.
What might contribute :
- It's writing on SSD.
- For every insert there is on average 1.5 reads −alternating reads and writes constantly.
I suspect the nice 50000 rate is up until some cache/buffer limit is reached. Then the big slow down might be due to SSD not handling read/write mixed together, as suggested on this question : Low-latency Key-Value Store for SSD.
Question is :
Where might this extreme slow down be from ? It can't be all SSD's fault. Lots of people use happily SSD for high speed DB process, and I'm sure they mix read and write a lot.
Thanks.
Edit : I've made sure to remove any memory limit, and the java process has always room to allocate more memory.
Edit : Removing readings and doing inserts only does not change the problem.
Last Edit : For the record, for hash tables it seems related to the initial number buckets. On Kyoto cabinet that number cannot be changed and is defaulted to ~1 million, so better get the number right at creation time (1 to 4 times the maximum number of records to store). For BDB, it is designed to grow progressively the number of buckets, but as it is ressource consuming, better predefine the number in advance.