I'm working with a data warehouse doing report generation. As the name would suggest, I have a LOT of data. One of the queries that pulls a LOT of data is getting to take longer than I like (these aren't performed ad-hoc, these queries run every night and rebuild tables to cache the reports).
I'm looking at optimizing it, but I'm a little limited on what I can do. I have one query that's written along the lines of...
SELECT column1, column2,... columnN, (subQuery1), (subquery2)... and so on.
The problem is, the sub queries are repeated a fair amount because each statement has a case around them such as...
SELECT
column1
, column2
, columnN
, (SELECT
CASE
WHEN (subQuery1) > 0 AND (subquery2) > 0
THEN CAST((subQuery1)/(subquery2) AS decimal)*100
ELSE 0
END) AS "longWastefulQueryResults"
Our data comes from multiple sources and there are occasional data entry errors, so this prevents potential errors when dividing by a zero. The problem is, the sub-queries can repeat multiple times even though the values won't change. I'm sure there's a better way to do it...
I'd love something like what you see below, but I get errors about needing sq1 and sq2 in my group by clause. I'd provide an exact sample, but it'd be painfully tedious to go over.
SELECT
column1
, column2
, columnN
, (subQuery1) as sq1
, (subquery2) as sq2
, (SELECT
CASE
WHEN (sq1) > 0 AND (sq2) > 0
THEN CAST((sq1)/(sq2) AS decimal)*100
ELSE 0
END) AS "lessWastefulQueryResults"
I'm using Postgres 9.3 but haven't been able to get a successful test yet. Is there anything I can do to optimize my query?