In-memory databases usually do not support a memory paging option (for the whole database or for certain tables), i.e., the total size of the database should be smaller than the available physical memory or the maximum shared memory size.
Depending on your application, data-access pattern, database size and the system memory available to the database, you have a few choices:
a. Pickled Python Data in File System
This stores structured Python data (such as a list of dictionaries/lists/tuples/sets, a dictionary of lists/pandas DataFrames/numpy arrays, etc.) in pickled format so that it can be used immediately and conveniently once unpickled. AFAIK, Python does not implicitly use the file system as a backing store for in-memory Python objects, but the host operating system may swap out Python processes in favor of higher-priority processes. This approach is suitable for static data that is smaller than the available system memory. The pickled files can be copied to other computers and read by multiple dependent or independent processes on the same computer. The file on disk and the objects in memory both take up more space than the raw data itself. Once loaded, this is the fastest way to access the data, because the data lives in the Python process's own memory and there is no query-parsing step.
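As a minimal sketch of option (a), assuming a small list of dictionaries as the data set: the sample records, the file name records.pkl and the by_id lookup dictionary are made up for illustration. The data is written once, then loaded and accessed as ordinary Python objects.

```python
import os
import pickle
import tempfile

# Hypothetical sample data: a list of dictionaries.
records = [
    {"id": 1, "name": "alice", "tags": ("a", "b")},
    {"id": 2, "name": "bob", "tags": ("c",)},
]

# Hypothetical file location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "records.pkl")

# Write the data once (for example, in a build or import step).
with open(path, "wb") as f:
    pickle.dump(records, f, protocol=pickle.HIGHEST_PROTOCOL)

# Any process that can read the file can load it back as Python objects.
with open(path, "rb") as f:
    loaded = pickle.load(f)

# After loading, lookups are plain dictionary/list operations with no query parsing.
by_id = {r["id"]: r for r in loaded}
print(by_id[2]["name"])
```

Each process that loads the file gets its own private copy of the objects, which is why this fits static data better than data that several processes need to update.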
b. In-memory Database
It stores dynamic or static data in memory. In-memory libraries with Python API bindings include Redis, sqlite3, Berkeley DB, rqlite, etc. Different in-memory databases offer different features:
- The database may be locked in physical memory so that it is not swapped out to the backing store by the host operating system. However, the actual implementation for the same library may vary across operating systems.
- The database may be served by a database server process.
- The in-memory database may be accessed by multiple dependent or independent processes.
- Support full, partial or no ACID model.
- The in-memory database could be persisted to physical files so that it is available when the host operating system is restarted.
- Support snapshots and/or different database copies for backup or database management.
- Support distributed databases using master-slave or cluster models.
- Support from simple key-value lookup to advanced query, filter, group functions (such as SQL, NoSQL)
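As one possible illustration of option (b), here is a sketch using the standard-library sqlite3 module with the special ":memory:" database name. The users table, its rows and the snapshot.db file name are made up for the example. It shows an SQL query against the in-memory database and a snapshot to a physical file via Connection.backup() (available since Python 3.7).

```python
import os
import sqlite3
import tempfile

# An in-memory SQLite database; it exists only while this connection is open.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
mem.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("alice", 30), ("bob", 25), ("carol", 35)],  # hypothetical rows
)
mem.commit()

# Advanced query/filter support through SQL.
older = mem.execute(
    "SELECT name FROM users WHERE age > ? ORDER BY name", (28,)
).fetchall()

# Persist a snapshot to a physical file so the data survives a restart.
snapshot_path = os.path.join(tempfile.mkdtemp(), "snapshot.db")
disk = sqlite3.connect(snapshot_path)
mem.backup(disk)
disk.close()

# Reopen the snapshot to confirm that the rows were copied.
restored = sqlite3.connect(snapshot_path)
count = restored.execute("SELECT COUNT(*) FROM users").fetchone()[0]
restored.close()
mem.close()

print(older, count)
```

A ":memory:" database opened this way is private to the connection that created it, so sharing data across independent processes would need a server-based option such as Redis instead.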
c. Memory-map Database/Data Structure
It stores static or dynamic data which could be larger than the physical memory of the host. Python developers could use APIs such as mmap.mmap() and numpy.memmap() to map certain files into the process memory space. The files could be arranged into index and data so that data can be looked up and accessed via the index. This is actually the mechanism used by various database libraries. Python developers could implement custom techniques to access and update data efficiently.
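A rough sketch of option (c) with the standard-library mmap module, assuming a made-up file layout of fixed-size records (a 4-byte id followed by a 12-byte name). With fixed-size records the "index" is simply record number times record size. The RECORD layout, the read_record helper and the records.bin file name are illustrative assumptions, not part of any library.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: little-endian 4-byte id + 12-byte name (16 bytes total).
RECORD = struct.Struct("<I12s")

path = os.path.join(tempfile.mkdtemp(), "records.bin")

# Build the data file once; the "s" format pads short names with null bytes.
with open(path, "wb") as f:
    for i, name in enumerate([b"alice", b"bob", b"carol"]):
        f.write(RECORD.pack(i, name))


def read_record(mm, index):
    """Look up a record by position without reading the whole file."""
    rec_id, raw = RECORD.unpack_from(mm, index * RECORD.size)
    return rec_id, raw.rstrip(b"\x00")


with open(path, "r+b") as f:
    # Length 0 maps the whole file; the OS pages data in on demand.
    with mmap.mmap(f.fileno(), 0) as mm:
        second = read_record(mm, 1)

        # Update the third record in place, then flush the change to the file.
        RECORD.pack_into(mm, 2 * RECORD.size, 2, b"dave")
        mm.flush()
        third = read_record(mm, 2)

print(second, third)
```

Only the pages that are actually touched need to be resident in memory, which is what lets the mapped file exceed physical memory. numpy.memmap() offers the same idea with an array interface.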