I have a categories table that uses the nested set model. Each row should contain the number of its sub-categories and the number of articles in them, or 0 if there aren't any.
I've searched around and found two possible solutions, but neither of them works:
MySQL & nested set: slow JOIN (not using index)
Why isn't MySQL using any of these possible keys?
Create Table categories:
CREATE TABLE `categories` (
`GROUP_ID` varchar(255) CHARACTER SET utf8 NOT NULL,
`GROUP_NAME` varchar(255) CHARACTER SET utf8 NOT NULL,
`PARENT_ID` varchar(255) CHARACTER SET utf8 NOT NULL,
`TYPE` enum('root','node','leaf') CHARACTER SET utf8 NOT NULL DEFAULT 'node',
`LEVEL` tinyint(2) NOT NULL DEFAULT '0',
`GROUP_ORDER` int(11) NOT NULL,
`GROUP_DESCRIPTION` text CHARACTER SET utf8 NOT NULL,
`total_articles` int(11) unsigned NOT NULL DEFAULT '0',
`total_cats` int(11) unsigned NOT NULL DEFAULT '0',
`lft` smallint(5) unsigned NOT NULL DEFAULT '0',
`rgt` smallint(5) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`GROUP_ID`),
KEY `PARENT_ID` (`PARENT_ID`),
KEY `lft` (`lft`),
KEY `rgt` (`rgt`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
total_cats is the number of sub-categories in the row's subtree.
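To make clear what I mean by that count, here is a minimal sketch on a made-up four-row tree. The table name demo_categories and its rows are hypothetical and not taken from my data, and it uses a LEFT JOIN only so that leaf rows show up with 0:

```sql
-- Hypothetical demo tree (not real data):
--   root (1,8)
--     A  (2,5)
--       A1 (3,4)
--     B  (6,7)
CREATE TABLE demo_categories (
  GROUP_ID varchar(32) NOT NULL PRIMARY KEY,
  lft smallint unsigned NOT NULL,
  rgt smallint unsigned NOT NULL
);

INSERT INTO demo_categories (GROUP_ID, lft, rgt) VALUES
  ('root', 1, 8),
  ('A',    2, 5),
  ('A1',   3, 4),
  ('B',    6, 7);

-- A row b is a sub-category of a when b's interval lies strictly inside a's.
SELECT a.GROUP_ID, COUNT(b.GROUP_ID) AS total_cats
FROM demo_categories AS a
LEFT JOIN demo_categories AS b
       ON b.lft > a.lft AND b.rgt < a.rgt
GROUP BY a.GROUP_ID;
```

For this tree I would expect total_cats to be 3 for root, 1 for A, and 0 for A1 and B.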
The following query does exactly what I want: it returns all sub-category and article counts. But it is very slow. It takes more than 80 seconds to run on ~5000 categories and ~40000 articles. The calculation of total_articles is already done by another script. (If there aren't any articles, all rows should hold 0 for total_articles.)
The Query:
SELECT a.GROUP_ID,
       a.PARENT_ID,
       COUNT(b.GROUP_ID) AS total_cats,
       (SELECT SUM(c.total_articles)
        FROM categories c
        WHERE c.PARENT_ID = a.GROUP_ID) AS total_articles
FROM categories AS b
INNER JOIN categories AS a
        ON a.lft < b.lft AND a.rgt > b.rgt
GROUP BY a.GROUP_ID
It results in something like this:
+-------------------------------------------+-------------------------------------+------------+----------------+
| GROUP_ID | PARENT_ID | total_cats | total_articles |
+-------------------------------------------+-------------------------------------+------------+----------------+
| 69_69_1 | 69_69_0 | 4252 | 0 |
| 69_69_Abfall__Wertstoffsammler___zubehoer | 69_69_NWEAB290h001 | 5 | 20 |
| 69_69_Abisolierzangen | 69_69_NWAAA458h001 | 4 | 56 |
| 69_69_Abzieher_2 | 69_69_NWAAB944h001 | 23 | 476 |
| 69_69_Abziehvorrichtung | 69_69_Abzieher_2 | 3 | 18 |
| 69_69_Aexte | 69_69_NWEAA615h001 | 6 | 45 |
| 69_69_Alarmgeraete_Melder | 69_69_Sicherungstechnik__Heimschutz | 3 | 4 |
| 69_69_Allgemeiner_Industriebedarf | 69_69_Industrieausruestung | 8 | 21 |
| 69_69_Allgemeines_Schweisszubehoer | 69_69_NWEAB113h001 | 27 | 97 |
| 69_69_Anker__Befestigungstechnik__1 | 69_69_Befestigungstechnik | 5 | 163 |
+-------------------------------------------+-------------------------------------+------------+----------------+
The EXPLAIN output, if it helps:
+----+--------------------+-------+------+---------------+-----------+---------+------+------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+-----------+---------+------+------+------------------------------------------------+
| 1 | PRIMARY | b | ALL | lft,rgt | NULL | NULL | NULL | 4253 | Using temporary; Using filesort |
| 1 | PRIMARY | a | ALL | lft,rgt | NULL | NULL | NULL | 4253 | Range checked for each record (index map: 0xC) |
| 2 | DEPENDENT SUBQUERY | c | ref | PARENT_ID | PARENT_ID | 767 | func | 7 | NULL |
+----+--------------------+-------+------+---------------+-----------+---------+------+------+------------------------------------------------+
As you can see, it doesn't use the indexes. If I put FORCE INDEX (lft,rgt) next to the JOIN, the query executes, but nothing changes. I also tried adding an index on both columns, lft and rgt:
ALTER TABLE `categories` ADD INDEX `nestedset` (`lft`, `rgt`);
But that doesn't help at all. The query is still slow.
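For reference, the FORCE INDEX variant mentioned above looked roughly like the sketch below. This is a reconstruction: the exact placement of the hint (after the alias of the joined table) is an assumption, and the article subquery is left out to keep it short.

```sql
-- Reconstructed sketch of the hinted query; hint placement is an assumption.
-- The total_articles subquery from the full query is omitted for brevity.
SELECT a.GROUP_ID,
       a.PARENT_ID,
       COUNT(b.GROUP_ID) AS total_cats
FROM categories AS b
INNER JOIN categories AS a FORCE INDEX (lft, rgt)
        ON a.lft < b.lft AND a.rgt > b.rgt
GROUP BY a.GROUP_ID;
```

Swapping the hint to name the composite nestedset index instead of lft and rgt would be the obvious next variation.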
Interestingly, the query is pretty fast when the categories table holds only a small number of rows, e.g. 260. But once it reaches 1000+ rows, it gets slower and slower.
Example data with ~4000 categories: http://pastebin.com/BsViwFM5 (it's a big file!)
Thanks for any help and hints!