SQLite3 query takes a lot of time, how would you do it?

Question

I have a sqlite3 DB table of around 250 000 rows, my code is written in python. I need to filter it in very specific wat, and it takes much too long time.

Table is as follows:

self.cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY, 
                                    CLCode INT, 
                                    DetectionTime INT,
                                    PlateNo VARCHAR)""")

It's a table for automatic plate number recognition result filtering. And I need to filter it to get (native sql-like statements :) ):

Get rows from table DetectedVehicles where vehicles were observed at 
CLCode="X" before they were observed at CLCode="Y". 
(implicite: they were observed at both of them)

So I need to get list of detectedvehicles, that crossed specific CLCodes in proper sequence, i.e. Y before X.

I managed to create something that is working, but it takes about 10seconds for the query. Is there a faster way?

The code goes here:

self.cur.execute('select distinct PlateNo from DetectedVehicles where CLCode=? intersect select PlateNo from DetectedVehicles where CLCode=?',(CountLocationNo[0],CountLocationNo[1]))
    PlatesTab=list(self.cur)
    Results=[]
    for Plate in PlatesTab:
        PlateQ1='select * from DetectedVehicles where PlateNo in (?) and ((select DetectionTime from DetectedVehicles where CLCode = ? and PlateNo in (?) ) <  (select DetectionTime from DetectedVehicles where CLCode = ? and PlateNo in (?)))'     
        R=list(self.cur.execute(PlateQ1,(Plate,CountLocationNo[0],Plate,CountLocationNo[1],Plate)))
        if R:
            TimesOD=self.curST2.execute('select DetectionTime from DetectedVehicles where PlateNo in (?) and (CLCode= ? or CLCode=?)',(Plate,CountLocationNo[0],CountLocationNo[1])).fetchall()
            if TimesOD:
               TravelTimes.append(TimesOD[1][0]-TimesOD[0][0])
               DetectionTimes.append(TimesOD[0][0])
            for i in R:
                Results.append(i[0])
    Results=tuple(Results)
    QueryCL=' intersect select * from DetectedVehicles where IDd in ' + str(Results)

Thanks in advance

Can you add your existing SQL code to the query? Does your table have any indexes on it? — , Mar 23 '12 at 11:39
Yes it has indexes - (distinct integer for every new detected vehicle). I'll add queries to my question via edit in a minute. — Intelligent-Infrastructure, Mar 23 '12 at 11:50
You need indexes on the fields you are using for the JOIN and WHERE, i.e. PlateNo and CLCode. Indexes on any other field will not help. — Ben, Mar 23 '12 at 13:14

score 2 · Answer 1 · answered Mar 23 '12 at 12:05

2

You can do it all in a single query.

select 
    dv1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from 
    DetectedVehicles dvPoint1 
    inner join DetectedVehicles dvPoint2
        on dvPoint1.PlateNo = dvPoint2.PlateNo
        and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
        and dvPoint1.DetectionTime < dvPoint2.DetectionTime

You will want an index on (PlateNo, DetectionTime, CLCode), or (CLCode, PlateNo). Try them both to see which is faster. PlateNo on it's own may do.

answered Mar 23 '12 at 12:05

Ben

34,935
6
74
113

Dear Ben, thank You for your solution. Query itself is quickest, but how can I get to the results? - fetchall() via sqlite in Python takes a lot of time? Why, and how to do it? – Intelligent-Infrastructure Mar 23 '12 at 12:47
Add the indexes I recommended to speed up `fetchall()`. The query time is not a realistic measure you need to time the whole thing including the `fetchall()`. – Ben Mar 23 '12 at 13:15

score 1 · Accepted Answer · answered Mar 23 '12 at 12:06

1

Try:

select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
  on x.PlateNo = y.PlateNo and 
     x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?

or:

select x.*
from DetectedVehicles x
where exists
(select 1
 from DetectedVehicles y
 where x.PlateNo = y.PlateNo and 
       x.DetectionTime < y.DetectionTime and
       x.CLCode=? and y.CLCode=?)

I would normally expect the latter query to execute more quickly, but it would be worth running both to check.

answered Mar 23 '12 at 12:06

Dear Mark, thank You for your solution. Query itself is quickest, but how can I get to the results? - fetchall() via sqlite takes a lot of time? Why, and how to do it? – Intelligent-Infrastructure Mar 23 '12 at 12:46
I regret that I am not familiar with the code you are using to retrieve the results, and therefore I am unable to advise on how to improve its performance. – Mar 23 '12 at 12:51

score 0 · Answer 3 · edited May 23 '17 at 12:27

Thank You guys for this feedback. I post it as an answer, and present time results:

1. fastest total (query 1.80s, fetchall 0.20s, total: 2s)

select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
  on x.PlateNo = y.PlateNo and 
     x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?

2. (query 1.83s, fetchall 0.19s, total: 2.02s)

select 
    dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from 
    DetectedVehicles dvPoint1 
    inner join DetectedVehicles dvPoint2
        on dvPoint1.PlateNo = dvPoint2.PlateNo
        and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
        and dvPoint1.DetectionTime < dvPoint2.DetectionTime

3. (query 1.82s, fetchall 1.09s, total: 2.91s)

select x.*
from DetectedVehicles x
where exists
(select 1
 from DetectedVehicles y
 where x.PlateNo = y.PlateNo and 
       x.DetectionTime < y.DetectionTime and
       x.CLCode=? and y.CLCode=?)

So thanks @Mark Bannister for Your answer, I'm going to accept it.

However one issue remains: cur.fetchall() takes enourmously long time.. and I need to get the results, how should I do it ? (For just a 100 of rows it takes around 2 minutes for each of your solutions). Solved issue: download new sqlite.dll to your python/dlls folder ... don't ask me why: Join with Pythons sqlite module is slower than doing it manually

You need to time the whole operation including `cur.fetchall()` to get a realistic answer. Then you need to create indexes to support the query. — Ben, Mar 23 '12 at 13:13

SQLite3 query takes a lot of time, how would you do it?

3 Answers3