8

I have a 100 MB SQLite database file that I would like to load into memory before performing SQL queries. Is it possible to do that in Python?

Thanks

Adam Matan
relima
  • That's what happens before too long -- it all winds up in memory. The only way to have an "all-in-memory" database is to open a database named ":memory:" and create and load the tables from external sources. What problem are you trying to solve? Is it too slow? How do you know it's the database and not your code? – S.Lott Sep 29 '10 at 23:16
  • How do I load the tables from an external db to a memory db? – relima Sep 30 '10 at 21:59
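
As of Python 3.7 (which postdates this thread), the standard library's sqlite3 module can do this disk-to-memory copy directly via Connection.backup. A minimal sketch, assuming an on-disk file named file.db:

import sqlite3

# Open the on-disk database and an empty in-memory database
disk = sqlite3.connect("file.db")  # example file name
mem = sqlite3.connect(":memory:")

# Copy the whole database from disk into memory, then query the copy
disk.backup(mem)
disk.close()

for row in mem.execute("SELECT name FROM sqlite_master"):
    print(row)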

4 Answers

12

apsw is an alternative wrapper for SQLite which lets you back up an on-disk database to memory before performing operations.

From the docs:

###
### Backup to memory
###

import apsw

# Your on-disk database (the file name here is just an example)
connection = apsw.Connection("mydatabase.db")

# We will copy the disk database into a memory database
memcon = apsw.Connection(":memory:")

# Copy into memory
with memcon.backup("main", connection, "main") as backup:
    backup.step()  # copy whole database in one go

# There will be no disk accesses for this query
for row in memcon.cursor().execute("select * from s"):
    pass

connection above is your on-disk db.

Lenna
Ryan Ginstrom
  • I like your solution, but there is one problem: I use pysqlite's row_factory feature a lot, and it seems that apsw does not have this feature. – relima Sep 30 '10 at 21:57
  • This has really solved my problem. My queries are MUCH faster now. – relima Oct 01 '10 at 19:23
  • import apsw
    mem_db_loader = apsw.Connection(file_sqlite_db)
    connection = apsw.Connection(":memory:")
    connection.backup("main", mem_db_loader, "main").step()
    cursor = connection.cursor()
    – relima Oct 01 '10 at 19:23
3

  1. Get an in-memory database running (standard stuff).
  2. Attach the disk database (file).
  3. Recreate the tables/indexes and copy over the contents.
  4. Detach the disk database (file).

Here's an example (taken from here) in Tcl, which could be useful for getting the general idea; a Python sketch of the same steps follows the Tcl code:

proc loadDB {dbhandle filename} {

    if {$filename != ""} {
        #attach persistent DB to target DB
        $dbhandle eval "ATTACH DATABASE '$filename' AS loadfrom"
        #copy each table to the target DB
        foreach {tablename} [$dbhandle eval "SELECT name FROM loadfrom.sqlite_master WHERE type = 'table'"] {
            $dbhandle eval "CREATE TABLE '$tablename' AS SELECT * FROM loadfrom.'$tablename'"
        }
        #create indices for the loaded tables
        foreach {sql_exp} [$dbhandle eval "SELECT sql FROM loadfrom.sqlite_master WHERE type = 'index'"] {
            $dbhandle eval $sql_exp
        }
        #detach the source DB
        $dbhandle eval {DETACH loadfrom}
    }
}
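
For the Python case, here is a sketch of the same attach-and-copy steps using the standard library sqlite3 module. The file name mydata.db is an assumption; substitute your own:

import sqlite3

def load_db_to_memory(filename):
    """Copy an on-disk SQLite database into a :memory: database
    by attaching the file and recreating its tables and indexes."""
    memcon = sqlite3.connect(":memory:")
    memcon.execute("ATTACH DATABASE ? AS loadfrom", (filename,))
    # Copy each table (skipping SQLite's internal tables)
    tables = memcon.execute(
        "SELECT name FROM loadfrom.sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'").fetchall()
    for (name,) in tables:
        memcon.execute('CREATE TABLE "%s" AS SELECT * FROM loadfrom."%s"'
                       % (name, name))
    # Recreate the indexes (sql is NULL for automatic indexes)
    for (sql,) in memcon.execute(
            "SELECT sql FROM loadfrom.sqlite_master "
            "WHERE type = 'index' AND sql IS NOT NULL").fetchall():
        memcon.execute(sql)
    memcon.commit()
    memcon.execute("DETACH DATABASE loadfrom")
    return memcon

memcon = load_db_to_memory("mydata.db")  # assumed file name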
ChristopheD
2

If you are using Linux, you can try tmpfs, which is a memory-based filesystem.

It's very easy to use:

  1. Mount tmpfs to a directory.
  2. Copy the SQLite db file to that directory.
  3. Open it as a normal SQLite db file.

Remember, anything in tmpfs will be lost after a reboot, so copy the db file back to disk if it has changed.
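
On most Linux distributions /dev/shm is already mounted as tmpfs, so no extra mount step is needed. A minimal sketch in Python, with assumed paths:

import shutil
import sqlite3

disk_path = "/home/user/mydata.db"   # assumed location of your database
tmpfs_path = "/dev/shm/mydata.db"    # /dev/shm is a tmpfs mount on most Linux systems

# Copy the database into the memory-backed filesystem and open it normally
shutil.copyfile(disk_path, tmpfs_path)
conn = sqlite3.connect(tmpfs_path)

# ... run your queries ...

# If the database was modified, persist it back to disk when done
conn.close()
shutil.copyfile(tmpfs_path, disk_path)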

animalize
1

Note that you may not need to explicitly load the database into SQLite's memory at all. Simply prime your operating system's disk cache by copying the file to the null device:

Windows: copy file.db nul:
Unix/Mac:  cp file.db /dev/null

This has the advantage that the operating system takes care of the memory management, and in particular will discard the cached data if something more important comes along.
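
The same priming can also be done portably from Python by reading the file once and discarding the data. A minimal sketch, using the file.db name from the commands above:

# Read the whole database file once so the OS caches its pages;
# the data itself is thrown away.
with open("file.db", "rb") as f:
    while f.read(1024 * 1024):  # 1 MB chunks
        pass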

Roger Binns
  • It may be only my computer, but this technique didn't really improve my performance. (Win 7 x64, 8 GB RAM.) – relima Oct 05 '10 at 22:39
  • It has worked for many other people on the SQLite mailing list in the past, especially just after a machine has booted, as it primes the file system cache. In your case it is most likely that the file didn't end up in the file system cache. (Some copy tools tell the OS to bypass the cache so that they don't throw out existing "good" content.) – Roger Binns Oct 06 '10 at 20:34
  • The "nul:" trick didn't work for me on Win7, but a real copy (to temp.db) does. It's a little annoying because I have to delete the temp file to avoid taking up excessive space on the HD, but it gets the file into the disk cache (making the first query just as fast as subsequent ones). – Peter Rust Nov 14 '11 at 21:56
  • Since you're in a programming language (Python) anyway, you could just dummy-read the whole file before doing work. Any experience with how this cache priming performs vs the `:memory:` backup method? – kxr May 19 '17 at 19:21