Using a PostgreSQL database:
I have a survey application where users enter activities and answer questions about their activities. The survey itself is called RECALLS_T
, entered events are EVENTS_T
, and answers are ANSWERS_T
. Answers are for provided activity questions, stored in ACTIVITY_QUESTIONS_T
, mapped by Lookup (LOOKUP_T
).
Then I need to run an event-based report where every row is an event from EVENTS_T
for every recall (all events combined for all recalls). However, some columns in that report need to indicate a value for certain answers, otherwise those cells are NULL. So this is a tabulated report.
Example (easy flat stuff first, then the complicated tabulated stuff):
RecallID | RecallDate | Event |..| WalkAlone | WalkWithPartner |..| ExerciseAtGym
256 | 10-01-19 | Exrcs |..| NULL | NULL |..| yes
256 | 10-01-19 | Walk |..| yes | NULL |..| NULL
256 | 10-01-19 | Eat |..| NULL | NULL |..| NULL
257 | 10-01-19 | Exrcs |..| NULL | NULL |..| yes
My SQL has inner selects for the tabulated answer-based columns and looks like this:
select
-- Easy flat stuff first
r.id as recallid, r.recall_date as recalldate, ... ,
-- Example of Tabulated Columns:
(select l.description from answers_t ans, activity_questions_t aq, lookup_t l
where l.id=aq.answer_choice_id and aq.question_id=13
and aq.id=ans.activity_question_id and aq.activity_id=27 and ans.event_id=e.id)
as transportationotherintensity,
(select l.description from answers_t ans, activity_questions_t aq, lookup_t l
where l.id=66 and l.id=aq.answer_choice_id and aq.question_id=14
and aq.id=ans.activity_question_id and ans.event_id=e.id)
as commutework,
(select l.description from answers_t ans, activity_questions_t aq, lookup_t l
where l.id=67 and l.id=aq.answer_choice_id and aq.question_id=14 and aq.id=ans.activity_question_id and ans.event_id=e.id)
as commuteschool,
(select l.description from answers_t ans, activity_questions_t aq, lookup_t l
where l.id=95 and l.id=aq.answer_choice_id and aq.question_id=14 and aq.id=ans.activity_question_id and ans.event_id=e.id)
as dropoffpickup,
The SQL works and the report gets rendered, but the performance is bad. I verified that it's proportionally bad: there's no magic bullet for a specific item that would fix it. Every inner select contributes to the bad performance. The result set of 1,000 rows takes 15 seconds, but should take no more than 2-3 seconds.
Note that these indexes already exist:
ANSWERS_T
: onACTIVITY_QUESTION_ID
,EVENT_ID
EVENTS_T
: onRECALL_ID
ACTIVITY_QUESTIONS_T
: onACTIVITY_ID
,QUESTION_ID
,ANSWER_CHOICE_ID
Is there something I'm doing wrong with these inner selects?