First, bsddb (or under it's new name Oracle BerkeleyDB) is not deprecated.
From experience LevelDB / RocksDB / bsddb are slower than wiredtiger, that's why I recommend wiredtiger.
wiredtiger is the storage engine for mongodb so it's well tested in production. There is little or no use of wiredtiger in Python outside my AjguDB project; I use wiredtiger (via AjguDB) to store and query wikidata and concept which around 80GB.
Here is an example class that allows mimick the python2 shelve module. Basically,
it's a wiredtiger backend dictionary where keys can only be strings:
import json
from wiredtiger import wiredtiger_open
WT_NOT_FOUND = -31803
class WTDict:
"""Create a wiredtiger backed dictionary"""
def __init__(self, path, config='create'):
self._cnx = wiredtiger_open(path, config)
self._session = self._cnx.open_session()
# define key value table
self._session.create('table:keyvalue', 'key_format=S,value_format=S')
self._keyvalue = self._session.open_cursor('table:keyvalue')
def __enter__(self):
return self
def close(self):
self._cnx.close()
def __exit__(self, *args, **kwargs):
self.close()
def _loads(self, value):
return json.loads(value)
def _dumps(self, value):
return json.dumps(value)
def __getitem__(self, key):
self._session.begin_transaction()
self._keyvalue.set_key(key)
if self._keyvalue.search() == WT_NOT_FOUND:
raise KeyError()
out = self._loads(self._keyvalue.get_value())
self._session.commit_transaction()
return out
def __setitem__(self, key, value):
self._session.begin_transaction()
self._keyvalue.set_key(key)
self._keyvalue.set_value(self._dumps(value))
self._keyvalue.insert()
self._session.commit_transaction()
Here the adapted test program from @saaj answer:
#!/usr/bin/env python3
import os
import random
import lipsum
from wtdict import WTDict
def main():
with WTDict('wt') as wt:
for _ in range(100000):
v = lipsum.generate_paragraphs(2)[0:random.randint(200, 1000)]
wt[os.urandom(10)] = v
if __name__ == '__main__':
main()
Using the following command line:
python test-wtdict.py & psrecord --plot=plot.png --interval=0.1 $!
I generated the following diagram:

$ du -h wt
60M wt
When write-ahead-log is active:

$ du -h wt
260M wt
This is without performance tunning and compression.
Wiredtiger has no known limit until recently, the documentation was updated to the following:
WiredTiger supports petabyte tables, records up to 4GB, and record numbers up to 64-bits.
http://source.wiredtiger.com/1.6.4/architecture.html