0

I have a query with 2 INNER JOIN statements, and only fetching a few column, but it is very slow even though I have indexes on all required columns.

My query

SELECT 
  dysfonctionnement, 
  montant, 
  listRembArticles, 
  case when dys.reimputation is not null then dys.reimputation else dys.responsable end as responsable_final
FROM 
  db.commandes AS com
  INNER JOIN db.dysfonctionnements AS dys ON com.id_commande = dys.id_commande
  INNER JOIN db.pe AS pe ON com.code_pe = pe.pe_id
WHERE 
  com.prestataireLAD REGEXP '.*' 
  AND pe_nom REGEXP 'bordeaux|chambéry-annecy|grenoble|lyon|marseille|metz|montpellier|nancy|nice|nimes|rouen|strasbourg|toulon|toulouse|vitry|vitry bis 1|vitry bis 2|vlg' 
  AND com.date_livraison BETWEEN '2022-06-11 00:00:00' 
  AND '2022-07-08 00:00:00';

It takes around 20 seconds to compute and fetch 4123 rows.

The problem

In order to find what's wrong and why is it so slow, I've used the EXPLAIN statement, here is the output:

| id | select_type | table | partitions | type   | possible_keys              | key         | key_len | ref                    | rows   | filtered | Extra       |
|----|-------------|-------|------------|--------|----------------------------|-------------|---------|------------------------|--------|----------|-------------|
|  1 | SIMPLE      | dys   |            | ALL    | id_commande,id_commande_2  |             |         |                        | 878588 |   100.00 | Using where |
|  1 | SIMPLE      | com   |            | eq_ref | id_commande,date_livraison | id_commande | 110     | db.dys.id_commande     |      1 |     7.14 | Using where |
|  1 | SIMPLE      | pe    |            | ref    | pe_id                      | pe_id       | 5       | db.com.code_pe         |      1 |   100.00 | Using where |

I can see that the dysfonctionnements JOIN is rigged, and doesn't use a key even though it could...

Table definitions

commandes (included relevant columns only)

CREATE TABLE `commandes` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `id_commande` varchar(36) NOT NULL DEFAULT '',
  `date_commande` datetime NOT NULL,
  `date_livraison` datetime NOT NULL,
  `code_pe` int(11) NOT NULL,
  `traitement_dysfonctionnement` tinyint(4) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_commande` (`id_commande`),
  KEY `date_livraison` (`date_livraison`),
  KEY `traitement_dysfonctionnement` (`traitement_dysfonctionnement`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

dysfonctionnements (again, relevant columns only)

CREATE TABLE `dysfonctionnements` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `id_commande` varchar(36) DEFAULT NULL,
  `dysfonctionnement` varchar(150) DEFAULT NULL,
  `responsable` varchar(50) DEFAULT NULL,
  `reimputation` varchar(50) DEFAULT NULL,
  `montant` float DEFAULT NULL,
  `listRembArticles` text,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_commande` (`id_commande`,`dysfonctionnement`),
  KEY `id_commande_2` (`id_commande`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

pe (again, relevant columns only)

CREATE TABLE `pe` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `pe_id` int(11) DEFAULT NULL,
  `pe_nom` varchar(30) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `pe_nom` (`pe_nom`),
  KEY `pe_id` (`pe_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Investigation

If I remove the db.pe table from the query and the WHERE clause on pe_nom, the query takes 1.7 seconds to fetch 7k rows, and with the EXPLAIN statement, I can see it is using keys as I expect it to do:

| id | select_type | table | partitions | type  | possible_keys              | key            | key_len | ref                    | rows   | filtered | Extra                                         |
|----|-------------|-------|------------|-------|----------------------------|----------------|---------|------------------------|--------|----------|-----------------------------------------------|
|  1 | SIMPLE      | com   |            | range | id_commande,date_livraison | date_livraison | 5       |                        | 389558 |   100.00 | Using index condition; Using where; Using MRR |
|  1 | SIMPLE      | dys   |            | ref   | id_commande,id_commande_2  | id_commande_2  | 111     | ooshop.com.id_commande |      1 |   100.00 |                                               |

I'm open to any suggestions, I see no reason not to use the key when it does on a very similar query and it definitely makes it faster...

Gugu72
  • 2,052
  • 13
  • 35
  • [How to Optimize Queries in a Database - The Basics](https://stackoverflow.com/q/3191623/3404097) [What are your most common sql optimizations?](https://stackoverflow.com/q/1332778/3404097) [When and why are database joins expensive?](https://stackoverflow.com/q/173726/3404097) [Tips for asking a good SQL question](https://meta.stackoverflow.com/q/271055/3404097) [Asking query performance questions](https://dba.meta.stackexchange.com/q/3034/43932) – philipxy Jul 08 '22 at 20:58

2 Answers2

1

I had a similar experience when MySQL optimiser selected a joined table sequence far from optimal. At that time I used MySQL specific STRAIGHT_JOIN operator to overcome default optimiser behaviour. In your case I would try this:

SELECT 
  dysfonctionnement, 
  montant, 
  listRembArticles, 
  case when dys.reimputation is not null then dys.reimputation else dys.responsable end as responsable_final
FROM 
  db.commandes AS com
  STRAIGHT_JOIN db.dysfonctionnements AS dys ON com.id_commande = dys.id_commande
  INNER JOIN db.pe AS pe ON com.code_pe = pe.pe_id

Also, in your WHERE clause one of the REGEXP probably might be changed to IN operator, I assume it can use index.

slkorolev
  • 5,883
  • 1
  • 29
  • 32
  • Wow, that is an unbelievable thing, I didn't know this join type and it's basically an INNER JOIN but stops sql doing weird stuff ig.. I'm currently redesigning queries and the database so REGEXP is definitely gonna be changed to IN, and another one can be entirely removed as it doesn't have any useful purpose... – Gugu72 Jul 08 '22 at 20:19
0

Remove com.prestataireLAD REGEXP '.*'. The Optimizer probably won't realize that this has no impact on the resultset. If you are dynamically building the WHERE clause, then eliminate anything else you can.

id_commande_2 is redundant. In queries where it might be useful, the UNIQUE can take care of it.

These indexes might help:

com:  INDEX(date_livraison, id_commande,  code_pe)
pe:  INDEX(pe_nom, pe_id)
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • As for the REGEXP, it's dynamic building that I'm redoing, but in this case it doesn't affect the query time. Can `id_commande` being UNIQUE take over as an INDEX on `id_commande` column even though this UNIQUE index is coupled with another column?? – Gugu72 Jul 09 '22 at 07:28
  • After testing with those 2 indexes and removing the REGEXP you mentioned, even though the query is sped up a bit, still no index is used at all on the `dysfonctionnements` table, so the request is still about 15s long. – Gugu72 Jul 09 '22 at 12:48