0

There are list1 and list2, each contains 1,104,824 values

table1 has 350,000,000 rows with 3 columns: ID, name1, name2

and this is what I tried to do:

con = mdb.connect('localhost','user','password','db')
cur = con.cursor()
for i in range(1104824)
    sql ="select count(distinct(a.ID)) from (select name1 ,ID from table1 where name2 <> '"+str(list1[i])+"') as a where a.name1 = '"+str(list2[i])+"'"
    cur.execute(sql)
    data = cur.fetchone()[0]

but it is very very slow. Is there any faster way to do this query?

dPdms
  • 173
  • 2
  • 14
  • post the table structure and what exactly you are trying to do. Surely there might be a way that does not involve 1.1 million queries? – e4c5 May 10 '16 at 02:40
  • If `ID` is the `PRIMARY KEY`, you can probably change `COUNT(DISTINCT ID)` to `COUNT(*)`. If `name1,name2` is UNIQUE, you can probably get rid of `ID`. – Rick James May 10 '16 at 04:43

2 Answers2

1

This is your query:

select count(distinct a.ID)
from (select name1, ID
      from table1
       where name2 <> '"+str(list1[i])+"'
      ) a
where a.name1 = '"+str(list2[i])+"'";

I would recommend writing this as:

select count(distinct ID)
from table1
where name2 <> '"+str(list1[i])+"' and
      name1 = '"+str(list2[i])+"'";

Then you can speed up the query with an index on table1(name1, name2, id) -- all three columns in that order.

Note: I would write the sql as:

    sql = """
select count(distinct ID)
from table1
where name2 <> '{0}' and name1 = '{1}'
""".format(str(list1[i]), str(list2[i]))
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

Seems like this would work as well with the appropriate indices:

select count(distinct id) 
from table1
where name2 <> 'Name1'
   and name1 = 'Name2'

Look into using parameterized queries though. Your query is vulnerable to sql injection and would break for names with apostrophes for example... Lots of examples out there, here's a couple: Python MySQL Parameterized Queries and https://stackoverflow.com/a/1633589/1073631

Community
  • 1
  • 1
sgeddes
  • 62,311
  • 6
  • 61
  • 83