0

like i said in the title i have a SQL query that joins many tables and uses CASE WHEN to return the progress of the projects

SELECT
p.id_project
CASE
WHEN po.id is not null then 'payement commpleted'
WHEN b.id is not null then 'bill recieved'
WHEN e.id is not null then 'project engaged'
(and still many other cases)
ELSE 'start of the project'
END as progress
FROM project p
left join decision d on d.id_project=p.id_project
LEFT JOIN engagement e on e.id_project=p.id_project
LEFT JOIN bill b on b.id_project=p.id_project
LEFT JOIN payment_order po on po.id_project=p.id_project
LEFT JOIN ..... (many other tables)

needless to say that not all joins are simple joins and not all the cases are simple cases, in fact most of them are complicated, the query takes about 1 sec to execute and when i remove the CASE column entirely it goes down to about .3 sec, the 1 sec query is still too much as it needs to be called in other queries. now i don't know if i can optimize the query more or not or if there is a better approach but any suggestions would be appreciated.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318
imstuckaf
  • 87
  • 8
  • The execution plan might give you some clues; see [How to describe performance issue in relational database?](https://stackoverflow.com/a/34975420/266304) ad the [query-optimization] [tag wiki](https://stackoverflow.com/tags/query-optimization/info). The speed-up from removing the case expression is probably just the optimiser saving time by not having to retrieve the IDs from all the other tables. The execution plan might tell you where the rest of the time is being spent. – Alex Poole Nov 08 '22 at 15:15
  • 1
    You are partly cross joining tables, by only joining on the project ID. For a *single* project with 10 decisions, 10 engagements, 10 bills and 10 payment orders you create 10 x 10 x 10 x 10 = 10000 rows. Get your join criteria straight. – Thorsten Kettner Nov 08 '22 at 15:21
  • @ThorstenKettner i don't know what you mean exactly but i think i already did that, most of my joins are like this : ``` left join ( select o.id_offre_f, bec.id, row_number() over (partition by o.id_offre_f order by bec.date_e desc) as seq from bec inner join o on bec.id_offre_f=o.id_offre_f where bec.type=1 ) bec on o.id_offre_f=bec.id_offre_f and bec.seq=1 ``` – imstuckaf Nov 09 '22 at 12:13
  • @ThorstenKettner plus the problem is not really with the JOINs, it is with the CASEs – imstuckaf Nov 09 '22 at 12:17
  • Okay, the cross join problem is with the joins in your example, but not with your real joins then. In your join you are selecting the table's latest entry, which is typically done with an `OUTER APPLY` instead, but using `ROW_NUMBER` does the same job. Are you using this table's columns somewhere in your real query or are you just using the ID in your `CASE` expression to check whether a row exists? That could explain the difference in speed. Without the `CASE` the DBMS might then not read the table at all. – Thorsten Kettner Nov 09 '22 at 12:53

1 Answers1

0

Based on your knowledge of the data, are there values that will come up more often?

For instance, are there more 'project engaged' than 'payment completed'? If so, it would speed up a bit if you re-arranged the case statement so that higher probable matches come first. This would prevent travelling down through case statement sections unneccessarily.

dewitte
  • 1
  • 1
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 12 '22 at 19:42