
I have a simple MyISAM table, about 2.2 GB with 1.6 million records. For the purposes of this question, the query is

SELECT `one_unindexed_field` FROM tbl_skews WHERE indexed_field > constant;

The only unusual thing about the table is that it has one large unindexed VARCHAR(8192) column.

If `one_unindexed_field` is the large VARCHAR field, the query takes about 7 times as long as the same query with any other field. That factor stays roughly constant for any number of records returned (say, 1,000 to 100,000), so presumably you can assume the returned set of records fits into memory easily.

mysql> show global status like 'created_tmp_disk_tables'; 

reports that zero tmp_disk_tables are being created. EXPLAIN returns the same results for either query.
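One diagnostic sketch (my suggestion, not from the original post): MySQL 5.7 still has a query cache, so it is worth timing both variants with `SQL_NO_CACHE` to make sure the measured difference reflects actual row reads. The field names here are the generic placeholders used in this question:

    -- Compare the two variants with the query cache bypassed,
    -- so timings reflect real index/data-file access.
    SELECT SQL_NO_CACHE small_numeric_field FROM tbl_skews WHERE indexed_field > constant;
    SELECT SQL_NO_CACHE big_varchar_field   FROM tbl_skews WHERE indexed_field > constant;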

How can I speed up my queries on this table? If it's not possible, can someone explain what is going on?

key_buffer_size=256M
tmp_table_size=64M
max_heap_table_size=64M
myisam_sort_buffer_size=88M
read_buffer_size=1M
read_rnd_buffer_size=2M

Edit: I found some hits suggesting that changing ROW_FORMAT to FIXED would probably speed up my query ... so I did that, and it actually made the query slightly slower.

Edit: I'm on Win10 64-bit, Server version: 5.7.16-log MySQL Community Server (GPL)

EXPLAIN returns this:

mysql> EXPLAIN SELECT skw_stk_vol FROM tbl_skews WHERE (tday_date >= 42795);
               id: 1
      select_type: SIMPLE
            table: tbl_skews
       partitions: NULL
             type: range
    possible_keys: ndx_skews_tday_date
              key: ndx_skews_tday_date
          key_len: 4
              ref: NULL
             rows: 406921
         filtered: 100
            Extra: Using index condition
dts
  • You can make a view and select your data from the view – Abanoub Makram Apr 20 '17 at 16:14
  • @AbanoubMakram that is nonsense. A view does not store data and simply referencing a view instead of the table to which it refers will not speed up the selection of data from the table. – JNevill Apr 20 '17 at 16:42
  • @OP is the `indexed_field` the large field or is `one_unindexed_field` the large text field? Curious if it's the selection of the field or the filtering of the field that is causing slow read. – JNevill Apr 20 '17 at 16:43
  • @JNevill I think the view may speed up the selection. That's my opinion; it may be wrong, and I'll be happy to be corrected. Thanks – Abanoub Makram Apr 20 '17 at 16:45
  • Assuming it's merely the selection and not the filtering, take a look at [this q&a](http://stackoverflow.com/questions/38308717/mysql-slow-with-large-text-fields-in-table) where the Answerer suggests changing ROW_FORMAT to `Dynamic` to allow more of the very large text fields to be squeezed into the same block, thus decreasing retrieval times from disk. This should help since it sounds like I/O is the main cost factor for this query. – JNevill Apr 20 '17 at 16:46
  • indexed_field is a separate field I'm keeping out of the conversation if possible ... I'm talking about selecting a "small" unindexed numeric field versus the big (unindexed) VARCHAR field. – dts Apr 20 '17 at 16:47
  • I think ROW_FORMAT is just for InnoDB; my table is MyISAM. I started this project with InnoDB but ditched it for MyISAM since this improved performance greatly. – dts Apr 20 '17 at 16:55
  • See edit above ... tried changing ROW_FORMAT, did not help. – dts Apr 20 '17 at 18:09
  • Would it make sense to [partition](https://dev.mysql.com/doc/refman/5.7/en/partitioning.html) the table so that the DB can multithread the query and only hit the partitions with index > constant? or if the value isn't populated on the majority of the records simulate a [function based index](http://stackoverflow.com/questions/10595037/is-it-possible-to-have-function-based-index-in-mysql) on the length of the field and only return records > 0 as part the index? Seems 100,000 out of 1.6 mil could save time if we eliminated all the 0's to begin with... – xQbert Apr 20 '17 at 18:21
  • Interesting... Generally speaking, the lookup should be the most expensive op, once the engine locates the record - it must be pretty fast to fetch it. And you have indexed field for lookup. But something stepped in in this case... Two questions here: what exactly EXPLAIN shows, and - did you try InnoDB by the same circumstances? – Yuri G Apr 20 '17 at 18:24
  • Ah, yeah, and the third: what environment are you on? – Yuri G Apr 20 '17 at 18:27
  • Partitioning would help, but would be hitting the problem with a sledgehammer ... last resort. I really want to know what is going on under the hood here ... Yuri, I will respond by editing the OP (have not tried innodb since I first rejected it for this project due to slower queries). – dts Apr 20 '17 at 18:54

1 Answer


If you want to improve this query, try creating a composite index to avoid the lookup on the table after finding the matching rows:

(indexed_field, one_unindexed_field)

But the thing is, you don't say how much time that query takes. Bringing back a large VARCHAR field will always be slower than an integer, simply because the data is larger.

So if a query like this works, then there isn't much more you can do:

SELECT `integer_unindexed_field` FROM tbl_skews WHERE indexed_field > constant;

Because the problem isn't finding the rows; it's just returning the result data.
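A sketch of that composite index, using the column names from the EXPLAIN output in the question (the exact names are assumptions on my part; also note that MyISAM caps total key length at 1000 bytes, so the VARCHAR(8192) column itself cannot be fully covered — at best a short prefix of it could be indexed):

    -- Composite ("covering") index: the query can then be answered
    -- from the index alone, with no lookup into the data file.
    ALTER TABLE tbl_skews ADD INDEX ndx_tday_vol (tday_date, skw_stk_vol);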

Juan Carlos Oropeza
  • I've got the right indexes ... only thing that would help is a composite covering the VARCHAR, but that's not possible with VARCHAR(8192). The bottom line, I've learned, is that mySQL is useless for any project that is slightly unusual. Example: I'm running the query for this project usually just once per day, and rebooting the server once per day. First query after server reboot is taking 218 seconds, subsequent queries take 2 seconds ... I could move this table to Jet MDBs and it would be fast, stable and scalable instead of garbage. – dts Apr 21 '17 at 21:55
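Regarding the 218-seconds-cold vs. 2-seconds-warm observation in the last comment: with MyISAM, `key_buffer_size` caches only the indexes; the data file (`.MYD`) is cached by the operating system, so the first query after a reboot pays for cold disk reads. A sketch of warming both caches at startup (the table name is from the question's EXPLAIN; the big-column name is hypothetical):

    -- Preload the table's indexes into the MyISAM key buffer.
    -- (Real MyISAM statement; it does nothing for the data file.)
    LOAD INDEX INTO CACHE tbl_skews;

    -- Warm the OS file cache for the data file by touching the rows
    -- the daily query will need (big_varchar_field is a placeholder).
    SELECT MAX(LENGTH(big_varchar_field)) FROM tbl_skews WHERE tday_date >= 42795;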