1

I have two tables. products and product_images.

Products: ~ 6 276 445 rows.

ProductImages: ~ 22 888 685 rows.

Table definitions:

products

CREATE TABLE `products` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `unique_id` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
  `feed_id` bigint(20) unsigned NOT NULL,
  `feed_item_id` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `item_group_id` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `name` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `slug` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `description` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `categorytext` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `categorytext_hash` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
  `manufacturer` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `product_url` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `price_vat` float(12,2) DEFAULT NULL,
  `price_vat_old` float(12,2) DEFAULT NULL,
  `vat` tinyint(4) DEFAULT NULL,
  `discount_percentage` tinyint(4) NOT NULL DEFAULT 0,
  `image_source_url` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `image_filename` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `gtin` varchar(14) COLLATE utf8mb4_unicode_ci NOT NULL,
  `ean` varchar(13) COLLATE utf8mb4_unicode_ci NOT NULL,
  `isbn` varchar(13) COLLATE utf8mb4_unicode_ci NOT NULL,
  `upc` varchar(12) COLLATE utf8mb4_unicode_ci NOT NULL,
  `mpn` varchar(70) COLLATE utf8mb4_unicode_ci NOT NULL,
  `missing_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `deleted_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `project_1` tinyint(4) NOT NULL DEFAULT 0,
  `project_2` tinyint(4) NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_id_UNIQUE` (`unique_id`),
  KEY `feed_id_updated_at` (`feed_id`,`updated_at`),
  KEY `feed_id` (`feed_id`),
  KEY `feed_id_missing_at` (`feed_id`,`missing_at`),
  KEY `feed_id_categorytext_hash_id` (`feed_id`,`categorytext_hash`,`id`),
  KEY `categorytext_hash` (`categorytext_hash`),
  KEY `missing_at_deleted_at_categorytext_hash_feed_id_id` (`missing_at`,`deleted_at`,`categorytext_hash`,`feed_id`,`id`),
  KEY `project_1_id` (`project_1`,`id`),
  KEY `project_2_id` (`project_2`,`id`),
  KEY `project_1` (`project_1`),
  KEY `project_2` (`project_2`),
  CONSTRAINT `fk_products_feeds` FOREIGN KEY (`feed_id`) REFERENCES `feeds` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=122268834 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

product_images

CREATE TABLE `product_images` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `product_unique_id` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
  `unique_id` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
  `feed_id` bigint(20) unsigned NOT NULL,
  `image_source_url` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `image_filename` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `deleted_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_id_UNIQUE` (`unique_id`),
  KEY `feed_id_updated_at` (`feed_id`,`updated_at`),
  KEY `product_unique_id` (`product_unique_id`),
  KEY `product_unique_id_id` (`product_unique_id`,`id`),
  CONSTRAINT `fk_product_images_products1` FOREIGN KEY (`product_unique_id`) REFERENCES `products` (`unique_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=333584333 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

My Query without ORDER BY

SELECT IMG.id
    FROM products PR
    INNER JOIN product_images IMG on IMG.product_unique_id = PR.unique_id
    WHERE 
        PR.project_1 = 1
        -- ORDER BY IMG.id
    LIMIT 1000 OFFSET 0

Duration: 0.016 sec

EXPLAIN:

id|select_type|table|type|possible_keys                          |key              |key_len|ref                          |rows  |Extra      |
--+-----------+-----+----+---------------------------------------+-----------------+-------+-----------------------------+------+-----------+
 1|SIMPLE     |PR   |ref |unique_id_UNIQUE,project_1_id,project_1|project_1        |1      |const                        |286960|           |
 1|SIMPLE     |IMG  |ref |product_unique_id,product_unique_id_id |product_unique_id|130    |products-storage.PR.unique_id|2     |Using index|

My Query with ORDER BY

SELECT IMG.id
    FROM products PR
    INNER JOIN product_images IMG on IMG.product_unique_id = PR.unique_id
    WHERE 
        PR.project_1 = 1
    ORDER BY IMG.id
    LIMIT 1000 OFFSET 0

Duration 17.922 sec

EXPLAIN:

id|select_type|table|type|possible_keys                          |key              |key_len|ref                          |rows  |Extra                          |
--+-----------+-----+----+---------------------------------------+-----------------+-------+-----------------------------+------+-------------------------------+
 1|SIMPLE     |PR   |ref |unique_id_UNIQUE,project_1_id,project_1|project_1        |1      |const                        |286960|Using temporary; Using filesort|
 1|SIMPLE     |IMG  |ref |product_unique_id,product_unique_id_id |product_unique_id|130    |products-storage.PR.unique_id|2     |Using index                    |

Problem

Query with order by is too slow. Is there a way to speed up query with some index or server settings? I need order by for pagination.

The goal is to select all images for project and send them through the api to another server.

I'm using mariadb 10.8.2

Server is on SSD with 8GB RAM

MoreMeg
  • 33
  • 6
  • 2
    Run `EXPLAIN` and show us the results, and consider `USING INDEX` – Barry Carter Jul 18 '22 at 15:38
  • [How to Optimize Queries in a Database - The Basics](https://stackoverflow.com/q/3191623/3404097) [What are your most common sql optimizations?](https://stackoverflow.com/q/1332778/3404097) [When and why are database joins expensive?](https://stackoverflow.com/q/173726/3404097) [Tips for asking a good SQL question](https://meta.stackoverflow.com/q/271055/3404097) [Asking query performance questions](https://dba.meta.stackexchange.com/q/3034/43932)) [How much research effort is expected of Stack Overflow users?](https://meta.stackoverflow.com/q/261592/3404097) [ask] [Help] [mre] – philipxy Jul 18 '22 at 20:27
  • 1
    To help you with your [tag:query-optimization] question, we need to see your table definitions, your index definitions, and the output of EXPLAIN. Please [read this](https://stackoverflow.com/tags/query-optimization/info), then [edit] your question. – O. Jones Jul 19 '22 at 00:47
  • `ORDER BY ... LIMIT ...` is a notorious performance antipatern. Of course, without ORDER BY the MariaDB server is free to return its result set in whatever order it wants. So, it returns the first thousand rows it generates then stops. With ORDER BY it has to accumulate all those rows then sort them. It is possible that a compound covering index on `product_images` containing that `id` column might be scannable in a way where the ORDER BY is provided by the index. But we gotta see your tables with indexes, and your EXPLAIN output to suggest anything. – O. Jones Jul 19 '22 at 00:53
  • i advice to use seek method – Taurus Silver Jul 19 '22 at 09:32
  • Unless I'm missing something, do you have an index on `IMG.id`? – Barry Carter Jul 19 '22 at 14:57
  • @barrycarter IMG.id is a PRIMARY key – MoreMeg Jul 19 '22 at 16:55
  • @TaurusSilver seek method is slow too because I must use order by.. I think I must use some denormalization method and not use inner join with order by – MoreMeg Jul 19 '22 at 16:57
  • You can try `USING INDEX` but I'm not sure that helps with `ORDER BY`-- it usually just speeds up JOINs – Barry Carter Jul 20 '22 at 12:35
  • Problem is that I can't build the right index that would speed up select. I didn't find a column index combination that wouldn't use temporary table. The temporary table seems to slow down the selection – MoreMeg Jul 20 '22 at 14:00

0 Answers0