0

I have three tables:

Shop_Table
   shop_id
   shop_name

Sells_Table
   shop_id
   item_id
   price

Item_Table
   item_id
   item_name

The Sells_Table links the item and shop tables via FK's to their ids. I am trying to get the most expensive item from each store, i.e., output of the form:

(shop_name, item_name, price)
(shop_name, item_name, price)
(shop_name, item_name, price)
(shop_name, item_name, price)
...

where price is the max price item for each shop. I can seem to achieve (shop_name, max(price)) but when I try to include the item_name I am getting multiple entries for the shop_name. My current method is

create view shop_sells_item as
select s.shop_name as shop, i.item_name as item, price
from Shop_Table s
join Sells_Table on (s.shop_id = Sells_Table.shop_id)
join Item_Table i on (i.item_id = Sells_Table.item_id);

select shop, item, max(price)
from shop_sells_item
group by shop;

However, I get an error saying that item must appear in the GROUP BY clause or be used in an aggregate function, but if I include it then I don't get the max price for each shop, instead I get the max price for each shop,item pair which is of no use.

Also, is using a view the best way? could it be done via a single query?

bqui56
  • 2,091
  • 8
  • 22
  • 40
  • 1
    You stated "I am trying to get the most expensive item from each store, i.e., output of the form." and then you said "where the price is min price item for each shop". So what are you exactly looking for, most expensive item of each shop or min price item for each shop. – Deepak Bhatia Sep 28 '13 at 10:52
  • Sorry about the min/max confusion, the query structure is the same regardless though so.. I edited my question to conform to max. – bqui56 Sep 28 '13 at 11:50

5 Answers5

2

You can do it Postgresql way:

select distinct on (shop_name) shop_name, item_name, price 
from shop_table
join sells_table using (shop_id) 
join item_table using (item_id)
order by shop_name, price;
Tomasz Myrta
  • 1,114
  • 8
  • 10
  • Was about to write an answer using window functions, but this is much cleaner. Possibly you could add item_name last in the order by list to make explicit what happens in situations where two or more items are "most expensive". – Petter Engström Sep 28 '13 at 13:11
  • For the record, this retrieves the *cheapest* item. – Erwin Brandstetter Oct 02 '13 at 00:29
1

I am trying to get the most expensive item from each store,

Definition of most expensive item for one shop := there is no item for this shop that is more expensive

SELECT *
FROM Sells_Table st
WHERE NOT EXISTS (
    SELECT * FROM Sells_Table nx
    WHERE nx.shop_id = st.shop_id
    AND nx.price > st.price
);

Now we have the sells_table entries that correspond to the items you want: the skeleton of the result set. Just add the meat :

SELECT sh.shop_name -- <<-- Meat
    , it.item_name  -- <<-- Meat
    , st.price
FROM Sells_Table st
JOIN shop_table sh ON sh.shop_id = st.shop_id  -- <<-- Meat
JOIN items_table it ON it.item_id = st.item_id -- <<-- Meat
WHERE NOT EXISTS (
    SELECT * FROM Sells_Table nx
    WHERE nx.shop_id = st.shop_id
    AND nx.price > st.price
);
wildplasser
  • 43,142
  • 8
  • 66
  • 109
1

Please note that the query below doesn't deal with a situation where multiple items in one store have the same maximum price (they are all the most expensive ones):

SELECT
    s.shop_name,
    i.item_name,
    si.price
FROM
    Sells_Table si
JOIN
    Shop_Table s
ON
    si.shop_id = s.shop_id
JOIN
    Item_Table i
ON
    si.item_id = i.item_id
WHERE
    (shop_id, price) IN (
        SELECT
            shop_id,
            MAX(price) AS price_max
        FROM
            Sells_Table
        GROUP BY
            shop_id
    );
miazo
  • 144
  • 7
1

Probably fastest:

SELECT s.shop_name, i.item_name, sub.price
FROM  (
   SELECT DISTINCT ON (shop_id) shop_id, item_id, price 
   FROM   sells_table
   ORDER  BY shop_id, price DESC -- , item_id -- for 1 item only
   ) sub
JOIN   item_table i USING (item_id)
JOIN   shop_table s USING (shop_id);
  • To get the most expensive item from each store you have to ORDER BY .. price DESC.

  • The above query gives you all items for a shop that share the maximum price. Can be more than one. If you always want a single item, you have to define how to break ties. For example, uncomment the additional ORDER BY item above to pick the one with minimum item_id.

  • It's generally faster to aggregate (or select) first and then join to additional tables which are not relevant for the aggregation or selection itself.

Details for DISTINCT ON in this related answer:
Select first row in each GROUP BY group?

You can easily create a VIEW from the above query, performance is about the same as from raw SQL.

Index for top performance

If you want that fast, try a multi-column index like the following:

CREATE INDEX sells_table_special_idx
ON sells_table (shop_id, price DESC, item_id)

Note the descending order on the second column.
The third column only serves to make it a covering index, which only works for PostgreSQL 9.2 or later.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • BTW: IMHO my `NOT EXISTS` version will be equally fast or maybe even faster than your aggregating subquery. In most cases they would produce equivalent plans. – wildplasser Sep 28 '13 at 14:55
  • @wildplasser: You are right. `NOT EXISTS` is a hot contender for the performance crown. OP needs to test with `EXPLAIN ANALYZE`. – Erwin Brandstetter Sep 30 '13 at 22:45
  • You are not going to skin cats (or mules) again, are you ;-? – wildplasser Sep 30 '13 at 22:59
  • @wildplasser: The poor cats! ;) And it's a lot of work to do it properly. If you feel merciless and have time on your hands, why don't you give it a shot? Either way, it's probably faster to exclude rows first and join to more tables later with the NOT EXISTS variant as well. Importance of multi-column index should go without saying, but I'll add that to my answer ... – Erwin Brandstetter Sep 30 '13 at 23:04
  • It is not worth the wile anyway. The OP is not very cooperative either. Just a bit of intuition, and an occasional check of query plans will do in this case. IMHO. SAVE THE CATS!!11! – wildplasser Sep 30 '13 at 23:12
0

Your question sounds a bit weird, but let me see if I can rephrase it.

For each store, I want to find the most expensive thing they sell. From that Item you want the name of it, its price and the store selling it. However, you also state you are using MIN() for the minimum price... Which is it, highest or lowest price. Either way, just change the min() vs max()

Start first with just the stores and the max price (or min) item they sell -- regardless of what it it

select shop_id, max(price) grouped by the shop_id only... Use this as the basis of then getting the item(s) that match this max price... A shop could have 10 items at $199 and you have nothing identifying you only want 1 and the criteria to handle that. I am getting the

select
      FindingMax.Shop_id,
      shop.shop_name,
      FindingMax.HighestPrice,
      Item_Table.item_name
   from
      ( select st.shop_id,
               max( st.price ) as HighestPrice
           from
               sells_table st
           group by
               st.shop_id ) as FindingMax
      JOIN Shop_Table shop
         on FindingMax.shop_id = shop.shop_id
      JOIN sells_table st2
         on FindingMax.shop_id = st2.shop_id
         AND FindingMax.HighestPrice = st2.price
         JOIN Item_Table
            on st2.item_id = Item_Table.item_id
DRapp
  • 47,638
  • 12
  • 72
  • 142