21

What is the best way to sort a table like this:

CREATE TABLE category(
    id INT(10),
    parent_id INT(10),
    name VARCHAR(50)
);

INSERT INTO category (id, parent_id, name) VALUES
(1, 0, 'pizza'),        --node 1
(2, 0, 'burger'),       --node 2
(3, 0, 'coffee'),       --node 3
(4, 1, 'piperoni'),     --node 1.1
(5, 1, 'cheese'),       --node 1.2
(6, 1, 'vegetariana'),  --node 1.3
(7, 5, 'extra cheese'); --node 1.2.1

To sort it hierarchically by id or name:
'pizza' //node 1
'piperoni' //node 1.1
'cheese' //node 1.2
'extra cheese' //node 1.2.1
'vegetariana' //node 1.3
'burger' //node 2
'coffee' //node 3

EDIT: the number at the end of the name is to visualize the strucutre better, it is not for sorting.

EDIT 2: as mentioned several times ... the number at the end of the name "cheese 1.2" was only for visualization purpose, NOT for sorting. I moved them as comments, too many people got confused, sorry.

d.raev
  • 9,216
  • 8
  • 58
  • 79
  • 2
    Oracle has a way to do it with `START WITH parent_id = 0 CONNECT BY PRIOR id = parent_id ORDER SIBLINGS BY id ASC`. I think that MySQL does not have such hierarchical queries. – Benoit Feb 15 '13 at 07:53
  • @Benoit: actually nearly all DBMS *except* for a few (including MySQL) can do something like that using a recursive common table expression. –  Feb 15 '13 at 09:01
  • 1
    is the tabel structure already defined or are you in the planning phase and could choose another structure? how many entries do you plan to have in the table ? is it modified often or is it important to have many read access on it ? – t.niese Feb 19 '13 at 08:34
  • @t.niese more columns can be added (top_parent_id, depth_level ..) but I m looking for solution in this structure if possible. Depth can be added (tree is not limited to 3 levels) but in general there should not be more then 50-100 entries and speed for this specific query is not an issue. – d.raev Feb 19 '13 at 09:29
  • 1
    See [Managing Hierarchical Data in MySQL](http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/) for a number of recipes. – Barmar Feb 23 '13 at 10:37
  • You'd do well to review SO question I wrote about this topic: http://stackoverflow.com/questions/4048151/what-are-the-options-for-storing-hierarchical-data-in-a-relational-database – orangepips Feb 25 '13 at 20:43

8 Answers8

15

By adding a path column and a trigger, this can be done fairly easily.

First add a varchar column that will contain the path from root to the node:

ALTER TABLE category ADD path VARCHAR(50) NULL;

Then add a trigger that calculates the path on insert:

(simply concats the new id with path of the parent)

CREATE TRIGGER set_path BEFORE INSERT ON category
  FOR EACH ROW SET NEW.path = 
  CONCAT(IFNULL((select path from category where id = NEW.parent_id), '0'), '.', New.id);

Then simply select order by path:

SELECT name, path FROM category ORDER BY path;

Result:

pizza         0.1
piperoni      0.1.4
cheese        0.1.5
extra cheese  0.1.5.7
vegetariana   0.1.6
burger        0.2
coffee        0.3

See fiddle.

This way maintenance cost is also minimal. The path field is hidden when inserting and is calculated via trigger. Removing a node has no overhead, since all the children of the node are also removed. The only problem is when updating the parent_id of a node; Well, don't do that! :)

jurgenreza
  • 5,856
  • 2
  • 25
  • 37
13

Nested Tree Sets in combination with a level column is a really nice technique for reading and sorting tree based structures. It is easy to select a sub-tree, limit the result to certain level, and do sorting in one query. But the cost for inserting and deleting entires is relatively high, so you should use it if you query your data more often then you write them and where reading performance is important. (for 50-100 the time for removing, inserting or moving elements should be no problem, even with 1000 it should not be problematic).

With each entry you store it's level and value for left and right, in the sample below it is: (left,right,level) if you want to select only the 1.2 with it's descendants you would do:

 SELECT * FROM table WHERE left >=7 AND right <=16

if you would like to select only the children then

 SELECT * FROM table WHERE left >=7 AND right <=16 AND level=2

if you want to sort you could do

 SELECT * FROM table WHERE left >=7 AND right <=16 ORDER BY left

Sorting by other fields while keeping the grouping of the hierarchy could be problematic, depending on how you would like to sort.

                               1 (0,17,0)
                                   |
                                   |
                   +---------------+---------------------------------------+
                   |                                                       |
              1.1 (1,6,1)                                            1.2 (7,16,1)
                   |                                                       |
      +------------+-------+                  +-------------------+--------+----------------+
      |                    |                  |                   |                         |
  1.1.1 (2,3,2)      1.1.2 (4,5,2)      1.2.1 (8,9,2)       1.2.2 (10,13,2)         1.2.2 (14,15,2)
                                                                  |
                                                                  |
                                                                  |
                                                            1.2.2.1 (11,12,3)

Closure Table (for completion, but I would not recommend for your use case). It stores all paths in the tree and therefore the required storage space for the hierarchy will grow really fast if you have many levels.

Path Enumeration there you store the path of each element with the entry /0/, /0/1/ querying path is easy there, but for sorting it is not that flexible.

For a small amount of entires I would use Nested Tree Sets. sadly I don't have a good reference page that describes these techniques and compares them.

t.niese
  • 39,256
  • 9
  • 74
  • 101
10

if there is only 3 levels of nesting you can do something like that

SELECT c1.name FROM category as c1 LEFT JOIN category as c2
   ON c1.parent_id = c2.id OR (c1.parent_id = 0 AND c1.id = c2.id) 
   ORDER BY c2.parent_id, c2.id, c1.id; 

if you have more nesting levels it would be more tricky

for more nesting level you can write function

delimiter ~
DROP FUNCTION getPriority~

CREATE FUNCTION getPriority (inID INT) RETURNS VARCHAR(255) DETERMINISTIC
begin
  DECLARE gParentID INT DEFAULT 0;
  DECLARE gPriority VARCHAR(255) DEFAULT '';
  SET gPriority = inID;
  SELECT parent_id INTO gParentID FROM category WHERE ID = inID;
  WHILE gParentID > 0 DO
    SET gPriority = CONCAT(gParentID, '.', gPriority);
    SELECT parent_id INTO gParentID FROM category WHERE ID = gParentID;
  END WHILE;
  RETURN gPriority;
end~

delimiter ;

so i now on

SELECT * FROM category ORDER BY getPriority(ID);

i have

+------+-----------+--------------------+
| ID   | parent_id | name               |
+------+-----------+--------------------+
|    1 |         0 | pizza 1            |
|    4 |         1 | piperoni 1.1       |
|    5 |         1 | cheese 1.2         |
|    7 |         5 | extra cheese 1.2.1 |
|    6 |         1 | vegetariana 1.3    |
|    2 |         0 | burger 2           |
|    3 |         0 | coffee 3           |
+------+-----------+--------------------+
BenMorel
  • 34,448
  • 50
  • 182
  • 322
Solon
  • 362
  • 1
  • 13
  • 2levels is easy .. the example is with 3 as you can see. your query throws "'name' is ambiguous" and even if fixed to `c2.name` the order doesn't work – d.raev Feb 15 '13 at 08:24
  • 1
    sorry, `SELECT c1.name FROM category as c1 LEFT JOIN category as c2 ON c1.parent_id = c2.id OR (c1.parent_id = 0 AND c1.id = c2.id) ORDER BY c2.id, c1.id;` and i can see only 2 levels (extra cheese refernce to pizza(1), not to cheese(5)) – Solon Feb 15 '13 at 08:32
  • The function is a nice idea, although I would be worried about the performance. Good thinking anyway. – Kickstart Feb 19 '13 at 14:25
  • @Solon can you update the declaration to be executable and add a working example for it. I your solution is the current best answer to my question and will give you the bounty if you make it nice. – d.raev Feb 25 '13 at 12:02
  • 1
    @d.raev edited, it works, but my advice to you is to save priority in some column, you can calculate it by my function when you add new row – Solon Feb 26 '13 at 06:49
  • 1
    Why would a function impact performance? It seems like a great solution and a whole lot cleaner than other solutions out there but I want to know what it would do to performance. – Joao Mar 18 '15 at 21:50
3

One way is to have separate string field for storing the full-path of any node. You need to maintain this field on every insert/update/delete operation.

You can have field value like below

CREATE TABLE category(
    id INT(10),
    parent_id INT(10),
    name VARCHAR(50),
    path VARCHAR(255)
);

INSERT INTO category (id, parent_id, name, path) VALUES
(1, 0, 'pizza 1','|1|'),
(2, 0, 'burger 2','|2|'),
(3, 0, 'coffee 3','|3|'),
(4, 1, 'piperoni 1.1','|1||4|'),
(5, 1, 'cheese 1.2','|1||5|'),
(6, 1, 'vegetariana 1.3','|1||6|'),
(7, 5, 'extra cheese 1.2.1','|1||5||1|');

You need to order by path field to have tree in proper sorting order.

SELECT * FROM `category` ORDER BY `path`;

See SqlFiddle Demo

This way you do not need recursion in programming language to print the entire tree in correct sort order.

Note:

This example will only work if you have max ID upto 9, as |1||11| will come earlier than |1||2|

To resolve this issue, you need to do padding for building string based on maximum value of ID field is expected for your application, like below example with max value expected is 999 (3 digits)

|001||002|


As per my experience this solution should only be good to handle tree with depth up-to 7-8 level.

For Other method : Click Here

Minesh
  • 2,284
  • 1
  • 14
  • 22
  • if I add extra field... and have to maintain it.. it could simple be an `order` field or something like that. Thenks for the article it is easily done with recursive code too, I m looking for SQL variant on this basic strucutre. – d.raev Feb 15 '13 at 12:43
3

I think everyone is over-architect-ing the solution. If your goal is really represented by your example, as in 3-levels with the virtual top level of 0 id, this should suffice.

SELECT *
     , id AS SORT_KEY
  FROM category a
 WHERE parent_id = 0
UNION ALL
SELECT a.*
     , CONCAT(b.id, '.', a.id) AS SORT_KEY
  FROM category a
     , category b
 WHERE b.parent_id = 0
   and b.id = a.parent_id
UNION ALL
SELECT a.*
     , CONCAT(c.id,'.', b.id,'.', a.id) AS SORT_KEY
  FROM category a
     , category b
     , category c
 WHERE c.parent_id = 0
   and b.id = a.parent_id
   AND c.id = b.parent_id
ORDER BY sort_key
Robert Co
  • 1,715
  • 8
  • 14
2

Sql

WITH CTE_Category
    AS
    (
      SELECT id, parent_id, name
      , RIGHT(name,CHARINDEX(' ',REVERSE(RTRIM(name)))-1) as ordername
      FROM Category 
    )

    SELECT id, parent_id, name FROM CTE_Category ORDER BY ordername

MySql

SELECT id, parent_id, name
FROM Category ORDER BY SUBSTRING_INDEX(name,' ',-1)
Community
  • 1
  • 1
bvr
  • 4,786
  • 1
  • 20
  • 24
-1

Try ORDER BY name , id at the end of your SQL query.

This will sort by name and use id to settle any ties.

Oleksi
  • 12,947
  • 4
  • 56
  • 80
-1
SELECT * FROM category ORDER BY name, parent_id ASC
Naveen D Almeida
  • 877
  • 6
  • 16