I have a very slow query where the first part is created by a gem (https://github.com/CanCanCommunity/cancancan, it creates the select and the inner query) and where I add an ORDER BY
and LIMIT
for a cursor based pagination.
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.48 sec)
This are the tables:
CREATE TABLE `spree_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`available_on` datetime DEFAULT NULL,
`permalink` varchar(255) DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`count_on_hand` int(11) DEFAULT NULL,
`vendor_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_spree_products_on_vendor_id` (`vendor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=37209248 DEFAULT CHARSET=utf8mb4
CREATE TABLE `spree_vendors` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`active` tinyint(1) DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4413 DEFAULT CHARSET=utf8mb4
(I removed unneccessary fields to keep it tidy)
The EXPLAIN
on the query above returns this:
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | spree_vendors | NULL | ALL | PRIMARY | NULL | NULL | NULL | 3465 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | spree_products | NULL | ref | PRIMARY,index_spree_products_on_vendor_id | index_spree_products_on_vendor_id | 5 | _hubert_test.spree_vendors.id | 8613 | 100.00 | Using index |
| 1 | SIMPLE | spree_products | NULL | eq_ref | PRIMARY | PRIMARY | 4 | _hubert_test.spree_products.id | 1 | 100.00 | NULL |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
When I remove the ORDER BY
the query is fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
LIMIT 50;
=> 50 rows in set (0.00 sec)
When I keep the ORDER BY
part from the outer query, but remove the WHERE
part from the sub query, the query also is fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
I tried adding a composite index to spree_vendors.id / spree_vendors.active
, but that didn't help.
Any idea, on how to optimise this query?
UPDATE 1:
A JOIN
Variant of this is also slow. The DISTINCT
is added by the gem to prevent duplicate records in case you don't select all columns:
SELECT DISTINCT `spree_products`.*
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 43.13 sec)
Without the DISTINCT
the query is fast.
UPDATE 2
It was pointed out, that using a LEFT OUTER JOIN
inside the sub query returns the whole table. But when using an INNER JOIN
it still is slow:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
INNER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.98 sec)