1

I have an order_lines table with order_id and extended_price column. I want to know the orders that have the sum of the extended price higher than the average of the sum of the extended price of all the orders. This is the query I got:

SELECT order_id, SUM(extended_price) AS "sumtotal"
FROM order_lines e
GROUP BY order_id
HAVING SUM(extended_price) > 
  (SELECT AVG(c.sumtotal) AS "avgtotal"
  FROM
    (SELECT order_id, SUM(extended_price) AS "sumtotal"
    FROM order_lines
    GROUP BY order_id) c
  )
ORDER BY sumtotal       

As we can see I have a subquery c to get the sumtotal which is used to calculate the avgtotal. But I'm running the same query as my main query to calculate the sumtotal again and compare against the avgtotal. Is there any better way to do it using only standard SQL features. I'm using PostgreSQL.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Saiful Shubho
  • 93
  • 4
  • 11

1 Answers1

1

One way would be to run a window function over the aggregate function in a subquery:

SELECT order_id, sumtotal
FROM  (
   SELECT order_id
        , SUM(extended_price) AS sumtotal
        , AVG(SUM(extended_price)) OVER () AS avgtotal
   FROM   order_lines
   GROUP  BY order_id
   ) sub
WHERE  sumtotal > avgtotal;

Should be faster.

Or use a CTE to avoid repeated evaluation. It adds some cost for the materialization of the interim result, though.

WITH cte AS (
   SELECT order_id, SUM(extended_price) AS sumtotal
   FROM   order_lines
   GROUP  BY order_id
   )
SELECT order_id, sumtotal
FROM   cte
WHERE  sumtotal > (SELECT avg(sumtotal) FROM cte);

You might use another CTE for the average for clarity, but subqueries are generally cheaper.

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228