0

I have a table Products which looks like the following:

+-----------+-----------+----------+
|ProductCode|ProductType|   ....   |
+-----------+-----------+----------+
|   ref01   |   BOOKS   |   ....   |
|   ref02   |   ALBUMS  |   ....   |
|   ref06   |   BOOKS   |   ....   |
|   ref04   |   BOOKS   |   ....   |
|   ref07   |   ALBUMS  |   ....   |
|   ref10   |   TOYS    |   ....   |
|   ref13   |   TOYS    |   ....   |
|   ref09   |   ALBUMS  |   ....   |
|   ref29   |   TOYS    |   ....   |
|   .....   |   .....   |   ....   |
+-----------+-----------+----------+

Another table Sales which looks like the following:

+-----------+-----------+----------+
|ProductCode|   Orders  |   ....   |
+-----------+-----------+----------+
|   ref01   |     15    |   ....   |
|   ref02   |     12    |   ....   |
|   ref06   |     20    |   ....   |
|   ref04   |     14    |   ....   |
|   ref07   |     11    |   ....   |
|   ref10   |     19    |   ....   |
|   ref13   |      3    |   ....   |
|   ref09   |      9    |   ....   |
|   ref29   |      5    |   ....   |
|   .....   |   .....   |   ....   |
+-----------+-----------+----------+

I am trying to find the products that were ordered more than the average of all other products of the same type.

By manually calculating, the result would be something like:

+-----------+-----------+----------+
|ProductCode|   Orders  |   ....   |
+-----------+-----------+----------+
|   ref02   |     12    |   ....   |
|   ref06   |     20    |   ....   |
|   ref07   |     11    |   ....   |
|   ref10   |     19    |   ....   |
|   .....   |   .....   |   ....   |
+-----------+-----------+----------+

So if looking in the type ALBUMS and product ref02, then I need to find the average of Orders of ALL OTHER ALBUMS. In this case, it is the average of ref06 and ref04, but there are more in the actual table. So what I need to do is the following:

Since product ref02 is 'ALBUMS', and ref07 and ref09 are also 'ALBUMS'. 
      So their average is (11+9)/2=10 <12.

Since product ref06 is 'BOOKS', and **ref01** and ref04 are also 'BOOKS'.                     
      So their average is (15+14)/2=14.5 <20.

Since product ref07 is 'ALBUMS', and **ref02** and ref09 are also 'ALBUMS'.           
      So their average is (12+9)/2=10.5<11.

Since product ref10 is 'TOYS', and ref13 and ref29 are also 'TOYS'           
      So their average is (3+5)/2=4<19.

The rest does not satisfy the condition thus will not be in the result.

I know how to and was able to find the average of orders for all products under the same type, but I have no idea how to find the average of orders for all other products under the same type.

I am using PostgreSQL, but cannot use any of these key words: WITH, OVER, LIMIT, PARTITION.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Hayul Kim
  • 11
  • 2
  • 1
    Why would you ***not*** want to use a superior solution with a window function? – Erwin Brandstetter Nov 11 '21 at 05:30
  • If that is an exercise or homework, it's a pretty stupid one that forbids using [modern SQL](https://modern-sql.com/) –  Nov 11 '21 at 07:53
  • @ErwinBrandstetter I’m sorry for the trouble but it is an exercise, and it is not that I don’t want to use them, it is that I’m not allowed to. – Hayul Kim Nov 11 '21 at 09:01
  • @a_horse_with_no_name I don’t disagree, but that’s the one I’m working with… :( Any ideas though? – Hayul Kim Nov 11 '21 at 09:03

2 Answers2

0

We can try using AVG() as an analytic function here:

WITH cte AS (
    SELECT p.ProductCode, p.ProductType, s.Orders,
           AVG(s.Orders) OVER (PARTITION BY p.ProductType) AS AvgOrders
    FROM Products p
    INNER JOIN Sales s ON s.ProductCode = p.ProductCode
)

SELECT ProductCode, ProductType, Orders
FROM cte
WHERE Orders > AvgOrders;

screen capture from demo link below

Demo

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • First of all, thank you so much for your help! It is really my fault but your answer does not really work for me. I mean it works fine, but there is this requirement that I am not allowed to use WITH. I should have mentioned it earlier but I forgot. I'm sorry about it. Unless WITH is not a critical part? Would you mind if you explain it a bit more? I am not really experienced with it. :) – Hayul Kim Nov 11 '21 at 02:02
  • Then just inline the CTE. Replace `FROM cte` in the bottom query with `FROM (SELECT p.ProductCode, p.ProductType, ... ) t` ... check [the updated demo here](https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=312744a90a700adf159e4827bd7d6eb3). – Tim Biegeleisen Nov 11 '21 at 02:04
  • Thanks for the update! But I am afraid this doesn't work either as OVER and PARTITION are also not allowed... :( – Hayul Kim Nov 11 '21 at 02:06
  • Sorry for the trouble but... Any other ideas of approach? – Hayul Kim Nov 11 '21 at 02:36
  • 1
    @HayulKim Sorry, I have already answered your question _twice_, and have provided working demos for those two versions. I'm afraid I'm not able to offer any more help at the moment. – Tim Biegeleisen Nov 11 '21 at 02:41
  • Alright, I understand. Anyways, I appreciate your help! :) – Hayul Kim Nov 11 '21 at 02:51
0

This is what I would use in Postgres 11 or later:
A solution with a window function using a custom frame definition:

SELECT product_code, orders
FROM  (
   SELECT product_code, s.orders
        , avg(orders) OVER (PARTITION BY p.product_type
                            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                            EXCLUDE CURRENT ROW) AS avg_orders
   FROM   product p
   JOIN   sales s USING (product_code)
   ) sub
WHERE  orders > avg_orders 
ORDER  BY product_code;  -- optional

I expect this to perform best.
Read the manual for details on window functions.

This is what you are asking for:
A solution without CTE, window function, or LIMIT (works in any modern Postgres version):

SELECT product_code, s.orders
FROM   product p
JOIN   sales s USING (product_code)
JOIN   LATERAL (
   SELECT avg(orders) AS avg_orders
   FROM   product p1
   JOIN   sales s1 USING (product_code)
   WHERE  p1.product_type =  p.product_type
   AND    p1.product_code <> p.product_code
   ) a ON a.avg_orders < s.orders
ORDER  BY product_code;  -- optional

Either produces your desired result.

db<>fiddle here

About LATERAL:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks! It works! But just as an addition to the problem, what if the ProductCode appears multiple times and has to use its sum to compare. Where would that modification happen in the latter solution? :) – Hayul Kim Nov 11 '21 at 13:14
  • @Hayul: Please ask a new *question*. Comments are not the place, and you have changed this one too often already. (Shouldn't change the question at all after answers have been given.) You can always link to this one for context. – Erwin Brandstetter Nov 11 '21 at 13:53
  • I'm sorry, will do that. It is just I thought I can work out the rest if I know what to do with the most difficult bit. As it turns out, I've overestimated my abilities... :( – Hayul Kim Nov 11 '21 at 17:01
  • @HayulKim: So is *this* question answered properly? https://stackoverflow.com/help/someone-answers – Erwin Brandstetter Nov 11 '21 at 23:45