
I'm doing a select from a table with 50 million rows, using 'select ... where in' on an indexed field (varchar(255)). The SQL is like this:

select Fid, Fnumber from t_word where Fnumber in (a,b,c)

where in fact (a, b, c) is a list of 500 elements.

In Navicat, the SQL returns within 0.12 s.

However, with pymysql 0.9.3:

l = (a, b, c)  # in practice a tuple of 500 Fnumber values
sql = 'select Fid, Fnumber from t_word where Fnumber in %s'
cursor.execute(sql, [l])  # pymysql expands the tuple into (a, b, c) for the IN clause
result = cursor.fetchall()

it took 60 seconds. I think pymysql is not using the index.

Did I format my SQL wrong?
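
For what it's worth, one way to check whether the formatting is the problem is to print the exact statement pymysql would send. This is only a minimal sketch, assuming an open pymysql connection with placeholder credentials and a stand-in value list:

import pymysql

# assumes a reachable MySQL server; host/user/password/db are placeholders
conn = pymysql.connect(host='localhost', user='user', password='pass', db='mydb')
cursor = conn.cursor()

l = ('a', 'b', 'c')  # stand-in for the real 500-element list
sql = 'select Fid, Fnumber from t_word where Fnumber in %s'

# mogrify() returns the exact statement that execute() would send,
# so you can see how the tuple was expanded and how the values were quoted
print(cursor.mogrify(sql, [l]))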

I've followed the code from pymysql select in with variable number of parameters.
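
If I understood that linked code correctly, it builds one %s placeholder per element instead of passing the whole tuple to a single placeholder. A rough sketch of that approach, assuming fnumbers stands in for the 500-element list:

# build one %s per element, then pass the values as separate parameters
fnumbers = ['a', 'b', 'c']  # stand-in for the real 500-element list

placeholders = ', '.join(['%s'] * len(fnumbers))
sql = 'select Fid, Fnumber from t_word where Fnumber in ({})'.format(placeholders)

cursor.execute(sql, fnumbers)
result = cursor.fetchall()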

My MySQL version is 5.7.18-cdb20170530-log.

The WHERE field is indexed USING BTREE.
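
To test the "not using the index" theory directly, one option is to run EXPLAIN on the same statement through pymysql and look at the key column. A minimal sketch, assuming the same cursor and value list as in the code above:

# run EXPLAIN through the same connection to see which index (if any) is chosen
explain_sql = 'explain select Fid, Fnumber from t_word where Fnumber in %s'
cursor.execute(explain_sql, [l])
for row in cursor.fetchall():
    print(row)  # the 'key' column should name the BTREE index on Fnumber if it is used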

vansdev
  • The SQL query is right, but it seems to be an issue related to the pymysql parser ( https://github.com/PyMySQL/PyMySQL/issues/475 , https://github.com/PyMySQL/PyMySQL/issues/297 ). The point is that there is a bottleneck for some reason. I sincerely advise you to move to another connector like you said, if you can, or at least limit the amount of data per query. Maybe you can check how it behaves with different table sizes and divide the data into multiple sets to query. You could also try rewriting the query by splitting the list, for example into two: (a, b) and (c); a chunking sketch along these lines follows these comments. – BUcorp Oct 17 '19 at 14:30
  • I switched to MySQLdb and still get the same performance. – vansdev Oct 18 '19 at 02:25
  • Exactly how much time does it take with MySQLdb? Also try via MySQL Workbench if you can, just to know the time you get there. If you get a sensible time there then, as I suspect, it's all about the connector which manages the returned data. I also found this interesting article about it: https://charlesnagy.info/it/python/python-mysqldb-vs-mysql-connector-query-performance – BUcorp Oct 18 '19 at 07:41
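
A minimal sketch of the splitting idea from the comments, assuming fnumbers is the full 500-element list and cursor is the same pymysql cursor as above; the chunk size of 100 is an arbitrary choice:

# query the 500 Fnumbers in smaller chunks and merge the results
chunk_size = 100
result = []
for i in range(0, len(fnumbers), chunk_size):
    chunk = tuple(fnumbers[i:i + chunk_size])
    cursor.execute('select Fid, Fnumber from t_word where Fnumber in %s', [chunk])
    result.extend(cursor.fetchall())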

0 Answers