0

I'd like to do analytics on a large table (10,000,000 rows / 40 columns). So I've loaded the table into SQLite to query it. I have a query like

SELECT x, y FROM table WHERE x in (10,20,30)

Now I've noticed that this is incredibly slow in SQLite, since I have an index on x only. I guess this is due to hard-drive access which jumps between different places in the file. A covering index (x, y) would solve the issue, but I don't want to create a covering index for every type of query I do.

So my question is: Can I tune SQLite to be faster? Or alternatively which database is likely to be faster than my solution (are other SQL DBs faster)?

PS: What's the actual reason the query is so slow?

Gere
  • 12,075
  • 18
  • 62
  • 94
  • How many records does you query return? – juergen d Oct 24 '14 at 12:01
  • Here are two resources to think about: http://stackoverflow.com/questions/784173/what-are-the-performance-characteristics-of-sqlite-with-very-large-database-file and https://www.sqlite.org/whentouse.html. – Gordon Linoff Oct 24 '14 at 12:04
  • The query returns a sizeable chunk of the rows. It depends on the analysis but can be as much as 10%. I've already tried VACUUM etc and couldn't find anything else on the internet. And this is very slow only the first time it is executed - so most likely something related to hard-drive? – Gere Oct 24 '14 at 12:07
  • Creating index on `(x, y)` pair has no sense here. You don't search by those columns, only by `x`. Index has nothing to do with what you return from the query. – Googie Oct 24 '14 at 13:21
  • @Googie: Oh, it makes a speed difference of at least a factor 1000. With the double index, the engine doesn't have to read the rows at all and only reads the index. With a single index it constantly jumps between index and row data. – Gere Oct 24 '14 at 13:52
  • Can you explain how this query is useful? – Aducci Oct 24 '14 at 15:48
  • @Aducci I'm sure what the question is. I basically sub-select by some variable and I want to get (part of) the rows in that table. – Gere Oct 24 '14 at 19:34
  • @Gerenuk What do you do with the result? Is this a one time thing? Is it going to be part of a report? – Aducci Oct 24 '14 at 19:40
  • @Aducci It is like a one-time thing. Someday I decide to do some analysis, so I load these rows into memory (computer program). I might do this query or something similar next day again. Depending on the analysis. – Gere Oct 25 '14 at 08:05

0 Answers0