0

I have a simple database with large amount of rows.

CREATE TABLE `tbl`(
    `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `name` NCHAR(50) NOT NULL
    ... other fields ...
  ) Engine=innodb;

I want to make paging imitation. And I know that classic offset is slow operation. Thats why I tried a trick from this website

My version of query is

SELECT * FROM `tbl` JOIN (SELECT id FROM `tbl` ORDER BY id LIMIT 1000000, 100) as b on `b`.`id` = `tbl`.`id`;

but the trick wasn't effective. The query works extremely slow, because mysql still reads all 1000000 lines.

And even separated inner query works witn primary keys only, but doesn't use a b-tree.

SELECT id FROM `tbl` ORDER BY id LIMIT 1000000, 100

So, why mysql engine doesn't use primary index when this query is totally covered with it?

Also I can't use where id > SOME_NUMBER condition because my table is very sparsed, and i don't know boundary ids of random page that was requested by user.

Panther
  • 3,312
  • 9
  • 27
  • 50
plflok
  • 13
  • 4

1 Answers1

2

Consider the following, where I force 2.2M rows into a table

Schema

create table tbl
(   id int auto_increment primary key,
    thing int not null
)engine=MyISAM;

insert tbl(thing) values (7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7);

insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;

The Counts

select count(*) as theCount,max(id) as theMax,min(id) as thMin from tbl;
+----------+---------+-------+
| theCount | theMax  | thMin |
+----------+---------+-------+
|  2228224 | 2228224 |     1 |
+----------+---------+-------+

Query A (yours, uses a derived table)

explain 
SELECT *  
FROM `tbl` 
JOIN 
(SELECT id FROM `tbl` ORDER BY id LIMIT 1000000, 100) as b 
on `b`.`id` = `tbl`.`id`;
+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows    | Extra       |
+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL | 1000100 | NULL        |
|  1 | PRIMARY     | tbl        | eq_ref | PRIMARY       | PRIMARY | 4       | b.id |       1 | NULL        |
|  2 | DERIVED     | tbl        | index  | NULL          | PRIMARY | 4       | NULL | 2228224 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+

Query B (not a derived table)

explain 
SELECT t1.*  
FROM tbl t1 
JOIN tbl t2 
on t2.id = t1.id 
where t2.id>1000000 
limit 100 
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                | rows    | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+--------------------------+
|  1 | SIMPLE      | t2    | range  | PRIMARY       | PRIMARY | 4       | NULL               | 1195836 | Using where; Using index |
|  1 | SIMPLE      | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | so_gibberish.t2.id |       1 | NULL                     |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+--------------------------+

For those not familiar with using Explain, see a write-up I did here

Community
  • 1
  • 1
Drew
  • 24,851
  • 10
  • 43
  • 78
  • Does main distinction of Query B is "where"-condition insread of "offset"? As I mentioned, I don't know boundary ids. So, I haven't id for substitution in condition, if user requires random pages, not sequential ones. Anyway your answer helped to realize much complex solution, thank you. – plflok Sep 27 '15 at 22:37