1

So I have a collection of recorded rat vocalizations (that's right the animal). There are several rats I have recordings from. In addition, each recording is sampled at 300e3 Hz and is several ms in duration. So each recording consists of tens of thousands of floats stored in a column called values. I've stored all this in an sqlite db, but I have a question about the optimal way to organize it. Right now I have a composite primary key (rat_id char(2), recording_# INT, time_step INT). The recording_# starts at 0 for each new rat_id, and the time_step starts at 0 for each new recording_#. The problem comes in when I try to delete a recording and deincrement all values for recording_index greater than the one I deleted. This of course takes forever because it has to do millions of operations each time this happens.

cur.execute("DELETE FROM recordings WHERE ratid=? AND recording_index=?", (self.rat, recording_index))
cur.execute("UPDATE recordings SET recording_index = -(recording_index-1) WHERE ratid=? AND recording_index>?", (self.rat, recording_index))
cur.execute("UPDATE recordings SET recording_index = -recording_index WHERE ratid=? AND recording_index<0", (self.rat,))

So is there a better way to organize the table such that the deincrement operation doesn't take as long, or should I just not bother with the deincrement operation. I don't really need to do it. I'm just being kind of overly concerned with details.

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
mdornfe1
  • 1,982
  • 1
  • 24
  • 42
  • Don't store index positions. Store sequence numbers. Use ordering. Embrace the gaps. – user2864740 Nov 19 '13 at 18:21
  • What do you mean by sequence numbers? – mdornfe1 Nov 19 '13 at 18:30
  • A value that supplies an ordering within a sequence, without referring to a dense index - needs not be sequential. – user2864740 Nov 19 '13 at 18:33
  • Sorry, maybe you can point me to some docs, because I'm not catching your meaning. – mdornfe1 Nov 19 '13 at 18:35
  • Don't update the recording_index. Databases aren't arrays. Then recording_index is no longer sequential over [0..n), which is most likely fine. – user2864740 Nov 19 '13 at 18:35
  • Do you ever need to access individual float values? Why don't you store the recordings as blobs? – CL. Nov 19 '13 at 19:19
  • I don't need to access individual floats. So I'm doing this through python. Can I store a numpy array as a blob? – mdornfe1 Nov 19 '13 at 19:30
  • Apparently you can't store np arrays, but i figured out how to register your own datatypes http://stackoverflow.com/questions/18621513/python-insert-numpy-array-into-sqlite3-database – mdornfe1 Nov 19 '13 at 21:52

0 Answers0