-1

I've got three tables:

Lessons:

CREATE TABLE lessons (
    id SERIAL PRIMARY KEY,
    title text NOT NULL,
    description text NOT NULL,
    vocab_count integer NOT NULL
);

+----+------------+------------------+-------------+
| id |   title    |   description    | vocab_count |
+----+------------+------------------+-------------+
|  1 | lesson_one | this is a lesson |           3 |
|  2 | lesson_two | another lesson   |           2 |
+----+------------+------------------+-------------+

Lesson_vocabulary:

CREATE TABLE lesson_vocabulary (
    lesson_id integer REFERENCES lessons(id),
    vocabulary_id integer REFERENCES vocabulary(id)
);

+-----------+---------------+
| lesson_id | vocabulary_id |
+-----------+---------------+
|         1 |             1 |
|         1 |             2 |
|         1 |             3 |
|         2 |             2 |
|         2 |             4 |
+-----------+---------------+

Vocabulary:

CREATE TABLE vocabulary (
    id integer PRIMARY KEY,
    hiragana text NOT NULL,
    reading text NOT NULL,
    meaning text[] NOT NULL
);

Each lesson contains multiple vocabulary, and each vocabulary can be included in multiple lessons.

How can I get the vocab_count column of the lessons table to be calculated and updated whenevr I add more rows to the lesson_vocabulary table. Is this possible, and how would I go about doing this?

Thanks

  • 1
    You need a trigger for that. But in general you shouldn't store information in a relational database that can be derived from existing data. Why not create a view that returns that count? –  Apr 09 '20 at 09:59
  • Right, I see your point. So I shouldn't create a column for this at all, and just create a view which displays all the linked vocabulary for each lesson? – brycelikesdogs Apr 09 '20 at 10:05
  • Also, so if I want to display the lesson information including the vocabulary_count of each lesson on a page. Would I first fetch all of the rows in the lessons table, and then run a query for each one to calculate the vocabulary_count myself? Is there a way I can do it in one step? – brycelikesdogs Apr 09 '20 at 10:09

3 Answers3

0

You can use SQL triggers to serve your purpose. This would be similar to mysql after insert trigger which updates another table's column.

The trigger would look somewhat like this. I am using Oracle SQL, but there would just be minor tweaks for any other implementation.

CREATE TRIGGER vocab_trigger
AFTER INSERT ON lesson_vocabulary
FOR EACH ROW 
begin
   for lesson_cur in (select LESSON_ID, COUNT(VOCABULARY_ID) voc_cnt from LESSON_VOCABULARY group by LESSON_ID) LOOP
   update LESSONS
       set VOCAB_COUNT = LESSON_CUR.VOC_CNT
       where id = LESSON_CUR.LESSON_ID;
  end loop; 
END;
0

It's better to create a view that calculates that (and get rid of the column in the lessons table):

select l.*, lv.vocab_count
from lessons l 
  left join (
    select lesson_id, count(*)
    from lesson_vocabulary
    group by lesson_id
  ) as lv(lesson_id, vocab_count) on l.id = lv.lesson_id

If you really want to update the lessons table each time the lesson_vocabulary changes, you can run an UPDATE statement like this in a trigger:

update lessons l
  set vocab_count = t.cnt
from (
  select lesson_id, count(*) as cnt
  from lesson_vocabulary
  group by lesson_id
) t
where t.lesson_id = l.id;
0

I would recommend using a query for this information:

select l.*,
       (select count(*)
        from lesson_vocabulary lv
        where lv.lesson_id = l.lesson_id
       ) as vocabulary_cnt
from lessons l;

With an index on lesson_vocabulary(lesson_id), this should be quite fast.

I recommend this over an update, because the data remains correct.

I recommend this over a trigger, because it is simpler.

I recommend this over a subquery with aggregation because it should be faster, particularly if you are filtering on the lessons table.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • In my experience the join against the derived table isn't really slower and filter conditions are happily pushed down by Postgres. The co-related subquery tends to be slower if you do that for many rows though. The more rows (from the lesson table) it has to process the bigger the difference gets –  Apr 09 '20 at 11:54
  • @a_horse_with_no_name . . . Not if you are selecting all the rows in the first table. However, if the rows are filtered in the outer query, then I don't think Postgres pushes the filtering to the aggregation subquery, so the aggregation is doing much more work. – Gordon Linoff Apr 09 '20 at 16:00
  • Postgres does push down the filter expression: [plan for derived table](https://explain.depesz.com/s/Z3f) [plan for scalar sub-query](https://explain.depesz.com/s/jA9C) (100.000 rows lessons, 1 million rows in lesson_vocabulary) –  Apr 09 '20 at 17:45