1

I've got a few tables in a Postgres 12.3 database.

The first one is named category:

id|template_id|name           |entry_count|is_base_template|can_rename|can_delete|section|userId|parentCategoryId|
--|-----------|---------------|-----------|----------------|----------|----------|-------|------|----------------|
 1|           |Notes          |          0|true            |true      |true      |A      |      |                |
 2|           |ToDo           |          0|true            |true      |true      |A      |      |                |
 3|          1|Notes          |          0|false           |true      |true      |A      |     1|                |
 4|          2|ToDo           |          0|false           |true      |true      |A      |     1|                |
 5|           |Must Do        |          0|false           |          |          |A      |      |               4|
 6|           |Important notes|          0|false           |          |          |A      |      |               3|

The second table is called entry - which has no bearing on the question at hand.

And there is the link table category_entries_entry:

categoryId|entryId|
----------|-------|
         4|      1|
         5|      5|
         5|      6|
         4|      7|
         3|      8|
         6|      9|

A category can possess children, if parentCategoryId is not NULL then we're dealing with a child. For instance, the category with id = 5 is a subcategory of id = 4. Children can't have their own children, so only one level of nesting.

I need to count the number of entries for each category including subcategories.

This request does what I need, mostly. But it does not take into account the user:

SELECT COALESCE(c."parentCategoryId" , c.id) as cat_id , COUNT(*) as entries_in_cat
FROM category c JOIN
     category_entries_entry r
     ON c.id = r."categoryId" 
WHERE c.is_base_template = false
GROUP BY cat_id;

Returns:

cat_id|entries_in_cat|
------|--------------|
     4|             4|
     3|             2|

The category table has also userId and the count should only be executed for a given user. Notably, only root categories have entries for userId.

And I want to list subcategories with their counts additionally. So the desired output with userId = 1 for the given sample is:

cat_id|entries_in_cat|
------|--------------|
     5|             2|
     4|             4|
     6|             1|
     3|             2|

Here's a break down:

1) Category number 6 is a subcategory of the 3rd category it has 1 entry, so the result is correct.

2) Category number 3 is a category (that's to say, it does not have a parent), it contains 1 entry and another one should come from the 6th subcategory, that's 2 in total. Your script returns 1 which is wrong.

3) Category number 5 is a subcategory of the 4th category, it contains 2 entries. Your script returns also 2 which is right.

4) category number 4 is a category, it has 2 entries of its own and another two come from the 5th subcategory, that's 4 in total. Your script returns 2 which is wrong. It should return 4.

How can I achieve that?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Albert
  • 2,146
  • 10
  • 32
  • 54
  • In the sample data the `userId` = `1` is specified for categories `3` and `4`. If we choose this user, should the subcategories `5` and `6` be also included? They don't have the same user. – The Impaler Jun 12 '20 at 19:15
  • @TheImpaler yeah because the subcategories 5 and 6 belong to the same user. It's evident from their `parentCategoryId`. Their corresponsing parent category ids are 4 and 3, so yeah they belong to the same user. – Albert Jun 12 '20 at 19:21
  • 1
    To focus the question, I took the liberty to copyedit extensively. Please fix if I got anything wrong. – Erwin Brandstetter Jun 14 '20 at 23:05
  • @ErwinBrandstetter I've just checked it, yeah everything looks great, now it's become more clear! :) – Albert Jun 14 '20 at 23:14

2 Answers2

1

Since the filtering applies to the parent categories only you'll need to apply it there first: then you can get the subcategories. Once you have the parent categories and the subcategories, you can join them to the entries and count them.

In sum, the query should look like:

with
c as (
  SELECT id
  FROM category
  WHERE userId = 1 AND is_base_template = false
),
s as (
  SELECT d.id
  FROM c
  JOIN category d on d.parentCategoryId = c.id
)
SELECT u.id, count(*) as entries_in_cat
FROM (select id from c union select id from s) u
JOIN category_entries_entry r ON u.id = r."categoryId" 
GROUP BY u.id
The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • Thank you. It throws an error at the moment, it says: `ERROR: missing FROM-clause entry for table "c"`. – Albert Jun 12 '20 at 19:52
  • 1
    @Albert Fixed... See https://www.db-fiddle.com/f/koMxJCwn8iovXZXHy5oNQf/0 – The Impaler Jun 12 '20 at 19:56
  • Hmm it works but still does not return what I need =( First of all, it's great if it returns counts for subcategories, BUT and this is important, the counts for categories should also include the numbers for subcategories. For example, the fourth category contains two its own entries but another two come from its subcategories, so the total should be four, not 2. Is it achievable? I would greatly appreciate if you would modify the script so that it would work this way. Thank you. – Albert Jun 12 '20 at 20:21
  • @Albert Please add the example to DBFiddle so I can run it and test it. Also please provide the exact expected result. – The Impaler Jun 12 '20 at 20:52
  • I extremely appologise for the delay, I was extremely exhausted this Friday...now returning to all this. I've edited my question, could you please take a look. I'm not very good with fiddle, but I hope I was clear enough in that edit I made. – Albert Jun 14 '20 at 20:16
1

This does the job for a single level of nesting:

To list only root categories, counts include subcategories:

WITH root AS (
   SELECT id AS cat_id, id AS sub_id
   FROM   category
   WHERE  is_base_template = false
   AND    "userId" = 1
   )
SELECT c.cat_id, count(*)::int AS entries_in_cat
FROM  (
   TABLE root
   UNION ALL
   SELECT r.cat_id, c.id
   FROM   root     r
   JOIN   category c ON c."parentCategoryId" = r.cat_id
   ) c
JOIN   category_entries_entry e ON e."categoryId" = c.sub_id
GROUP  BY c.cat_id;

The point is to join on sub_id, but group by cat_id.

To list root categories like above, and subcategories additionally:

WITH root AS (
   SELECT id AS cat_id, id AS sub_id
   FROM   category
   WHERE  is_base_template = false
   AND    "userId" = 1
   )
, ct AS (
   SELECT c.cat_id, c.sub_id, count(*)::int AS ct
   FROM  (
      TABLE root
      UNION ALL
      SELECT r.cat_id, c.id AS sub_id
      FROM   root     r
      JOIN   category c ON c."parentCategoryId" = r.cat_id
      ) c
   JOIN   category_entries_entry e ON e."categoryId" = c.sub_id
   GROUP  BY c.cat_id, c.sub_id
   )
SELECT cat_id, sum(ct)::int AS entries_in_cat
FROM   ct
GROUP  BY 1

UNION ALL
SELECT sub_id, ct
FROM   ct
WHERE  cat_id <> sub_id;

db<>fiddle here

For an arbitrary number of nesting levels, use a recursive CTE. Example:

About the optional short syntax TABLE parent:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Sorry for the delay with the reply, I was half dead this Friday...the script does what I need but is it possible to add counts for subcategories (like in the EDIT I made in the question)? If not , then I will have to count them with a separate query. – Albert Jun 14 '20 at 20:21
  • 1
    @Albert: I added another solution to list subcategories additionally. – Erwin Brandstetter Jun 14 '20 at 22:45
  • Thank you ever so much! – Albert Jun 14 '20 at 22:54