1

I have a MySQL table articles in which I have organized the content in the following hierarchy:

  1. Section
  2. Subject
  3. Chapter
  4. Post

Each item from the above is on a row having fields: id, parent, name etc.

The post row parent equals chapter row id, chapter row parent equals subject row id and subject row parent equals section row id.

I don't have taller hierarchy than above.

I need to select the list of subjects in the a given section along with the count of all children of each subject. The count is the sum of number of descendant chapters and their descendants which are posts.

My brother helped me with the following select query. However it is relatively slow at ~0.6 seconds.

SELECT
    subjects.id,
    subjects.name,
    subjects.link,
    (
    SELECT
        COUNT(DISTINCT posts.id)
    FROM
        articles AS chapters,
        articles AS posts
    WHERE
        chapters.parent = subjects.id AND(
            posts.parent = chapters.id OR posts.parent = subjects.id
        )
    ) AS child_count
FROM
    articles AS subjects
WHERE
    subjects.parent = 62

I need help with improving the performance please.

Thanks a lot!!

Irfanullah Jan
  • 3,336
  • 4
  • 24
  • 34
  • Possible duplicate of [How to create a MySQL hierarchical recursive query](https://stackoverflow.com/questions/20215744/how-to-create-a-mysql-hierarchical-recursive-query) – Madhur Bhaiya Jul 06 '19 at 04:00
  • 1
    If it is slow, then you probably forgot to create an **index** on `parent`. – Andreas Jul 06 '19 at 04:01
  • You said *"post row `parent` equals chapter row `id`"*, so why `posts.parent = subjects.id`? That will never map anything, since a subject cannot be the parent of a post. – Andreas Jul 06 '19 at 04:03
  • wow just added an index on parent column. and it now processes in 0.02 seconds @Andreas thanks !! – Irfanullah Jan Jul 06 '19 at 04:07
  • Regarding that `posts.parent = subjects.id` I think we had to add that because otherwise it only counted posts. I needed to count chapter rows as well. Even though it says `posts.parent` it is actually the chapter row because the parent is subject. Weird how the query works lol @Andreas – Irfanullah Jan Jul 06 '19 at 04:12
  • 1
    Sad that it is being voted to close because this question is different from the one posted as possible duplicate. I am concerned with the count of children. – Irfanullah Jan Jul 06 '19 at 04:16

1 Answers1

3

This logic is hard to follow, but I think you intend:

SELECT s.id, s.name, s.link,
       ( (SELECT COUNT(*)  -- count children
          FROM articles c
          WHERE c.parent = s.id
         ) +
         (SELECT COUNT(*)  -- count grandchildren
          FROM articles c JOIN
               articles p
               ON p.parent = c.id
          WHERE c.parent = s.id
         )
    ) as child_count
FROM articles s
WHERE s.parent = 62;

Then for this query, you want an index on articles(parent).

Notes:

  • Never use commas in the FROM clause.
  • Always use proper, explicit, standard JOIN syntax. Your brother needs to learn how to write correct SQL as well.
  • COUNT(DISTINCT) can be more expensive than just COUNT().
  • ORs in the correlation or ON clause can impede the optimizer.
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786