9

What is the most elegant method for dumping a list in python into an sqlite3 DB as binary data (i.e., a BLOB cell)?

data = [ 0, 1, 2, 3, 4, 5 ]
# now write this to db as binary data
# 0000 0000
# 0000 0001
# ...
# 0000 0101
Noah
  • 15,080
  • 13
  • 104
  • 148
Gilad Naor
  • 20,752
  • 14
  • 46
  • 53

5 Answers5

22

It seems that Brian's solution fits your needs, but keep in mind that with that method your just storing the data as a string.

If you want to store the raw binary data into the database (so it doesn't take up as much space), convert your data to a Binary sqlite object and then add it to your database.

query = u'''insert into testtable VALUES(?)'''
b = sqlite3.Binary(some_binarydata)
cur.execute(query,(b,))
con.commit()

(For some reason this doesn't seem to be documented in the python documentation)

Here are some notes on sqlite BLOB data restrictions:

http://effbot.org/zone/sqlite-blob.htm

monkut
  • 42,176
  • 24
  • 124
  • 155
9

Assuming you want it treated as a sequence of 8-bit unsigned values, use the array module.

a = array.array('B', data)
>>> a.tostring()
'\x00\x01\x02\x03\x04\x05'

Use different typecodes than 'B' if you want to treat the data as different types. eg. 'b' for a sequence of signed bytes, or 'i' for a signed integer.

Brian
  • 116,865
  • 28
  • 107
  • 112
4

I have the same problem, and I'm thinking about solving this in another way.

I think the pickle module is done exactly for something like this (serialization on python objects)

Example (this one is for dumping to file... but I think it's easily changeble for db storage)

Saving:

# Save a dictionary into a pickle file.
import pickle
favorite_color = { "lion": "yellow", "kitty": "red" }
pickle.dump( favorite_color, open( "save.p", "w" ) )

Loading:

# Load the dictionary back from the pickle file.
import pickle
favorite_color = pickle.load( open( "save.p" ) )

IMHO I think this way is more elegant and safer(it works for any python object).

That's my 2 cents

UPDATE: After doing a bit of search on my idea, they show some gotchas on my solution ( I can't make sql searches on that field). But I still think that it's a decent solution (if you don't need to search that field.

Community
  • 1
  • 1
Diego Castro
  • 3,458
  • 4
  • 35
  • 42
  • 2
    Note that the `pickle` module is insecure and can be incorrect. It's usually better to use a more safe, standardized, serialization protocol like JSON. – Mike Graham Mar 04 '11 at 13:46
2

See this general solution at SourceForge which covers any arbitrary Python object (including list, tuple, dictionary, etc):

y_serial.py module :: warehouse Python objects with SQLite

"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."

http://yserial.sourceforge.net

  • Although not strictly a good answer to the OPs question, I must say that I followed your link to yserial and the module is highly appropriate to any project that needs to store and retrieve Python objects in sqlite. Thank you! – venzen Feb 20 '13 at 10:27
1

It is possible to store object data as pickle dump, jason etc but it is also possible to index, them, restrict them and run select queries that use those indices. Here is example with tuples, that can be easily applied for any other python class. All that is needed is explained in python sqlite3 documentation (somebody already posted the link). Anyway here it is all put together in the following example:

import sqlite3
import pickle

def adapt_tuple(tuple):
    return pickle.dumps(tuple)    

sqlite3.register_adapter(tuple, adapt_tuple)    #cannot use pickle.dumps directly because of inadequate argument signature 
sqlite3.register_converter("tuple", pickle.loads)

def collate_tuple(string1, string2):
    return cmp(pickle.loads(string1), pickle.loads(string2))

#########################
# 1) Using declared types
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)

con.create_collation("cmptuple", collate_tuple)

cur = con.cursor()
cur.execute("create table test(p tuple unique collate cmptuple) ")
cur.execute("create index tuple_collated_index on test(p collate cmptuple)")

cur.execute("select name, type  from sqlite_master") # where type = 'table'")
print(cur.fetchall())

p = (1,2,3)
p1 = (1,2)

cur.execute("insert into test(p) values (?)", (p,))
cur.execute("insert into test(p) values (?)", (p1,))
cur.execute("insert into test(p) values (?)", ((10, 1),))
cur.execute("insert into test(p) values (?)", (tuple((9, 33)) ,))
cur.execute("insert into test(p) values (?)", (((9, 5), 33) ,))

try:
    cur.execute("insert into test(p) values (?)", (tuple((9, 33)) ,))
except Exception as e:
    print e

cur.execute("select p from test order by p")
print "\nwith declared types and default collate on column:"
for raw in cur:
    print raw

cur.execute("select p from test order by p collate cmptuple")
print "\nwith declared types collate:"
for raw in cur:
    print raw

con.create_function('pycmp', 2, cmp)

print "\nselect grater than using cmp function:"
cur.execute("select p from test where pycmp(p,?) >= 0", ((10, ),) )
for raw in cur:
    print raw

cur.execute("select p from test where pycmp(p,?) >= 0", ((3,)))
for raw in cur:
    print raw 

print "\nselect grater than using collate:"
cur.execute("select p from test where p > ?", ((10,),) )
for raw in cur:
    print raw  

cur.execute("explain query plan select p from test where p > ?", ((3,)))
for raw in cur:
    print raw

cur.close()
con.close()
pervlad
  • 41
  • 2