2

I’m fairly proficient at SQL, however this question has had myself stumped for quite a while now. In the most basic sense, there are simply two tables:

Items
+----+--------+
| id | title  |
+----+--------+
|  1 | socks  |
|  2 | banana |
|  3 | watch  |
|  4 | box    |
|  5 | shoe   |
+----+--------+

...and the prices table:

Prices
+---------+-----------+-------+------------+
| item_id | vendor_id | price | created_at |
+---------+-----------+-------+------------+
|       1 |         1 | 5.99  | Today      |
|       1 |         2 | 4.99  | Today      |
|       2 |         1 | 6.99  | Today      |
|       2 |         2 | 6.99  | Today      |
|       1 |         1 | 3.99  | Yesterday  |
|       1 |         1 | 4.99  | Yesterday  |
|       2 |         1 | 6.99  | Yesterday  |
|       2 |         2 | 6.99  | Yesterday  |
+---------+-----------+-------+------------+

(Please note: created_at is actually a timestamp, the words “Today” and “Yesterday” were provided merely to quickly convey the concept).

My goal is to get a simple result back containing the inventory item associated with the most recent, lowest price, including the reference to the vendor_id who is providing said price.

However, I find the stumbling block appears to be the sheer number of requirements for the statement (or statements) to handle:

  • Each item has multiple vendors, so we need to determine which price between all the vendors for each item is the lowest
  • New prices for the items get appended regularly, therefore we only want to consider the most recent price for each item for each vendor
  • We want to roll all that up into a single result, one item per row which includes the item, price and vendor

It seems simple, but I’ve found this problem to be uncanningly difficult.

As a note, I’m using Postgres, so all the fanciness it provides is available for use (ie: window functions).

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
James Lai
  • 2,051
  • 17
  • 14

3 Answers3

4

Much simpler with DISTINCT ON in Postgres:

Current price per item for each vendor

SELECT DISTINCT ON (p.item_id, p.vendor_id)
       i.title, p.price, p.vendor_id
FROM   prices p
JOIN   items  i ON i.id = p.item_id
ORDER  BY p.item_id, p.vendor_id, p.created_at DESC;

Optimal vendor for each item

SELECT DISTINCT ON (item_id) 
       i.title, p.price, p.vendor_id -- add more columns as you need
FROM (
   SELECT DISTINCT ON (item_id, vendor_id)
          item_id, price, vendor_id -- add more columns as you need
   FROM   prices p
   ORDER  BY item_id, vendor_id, created_at DESC
   ) p
JOIN   items i ON i.id = p.item_id
ORDER  BY item_id, price;

->SQLfiddle demo

Detailed explanation:
Select first row in each GROUP BY group?

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    This shows each vendor's current price. I believe the OP also wants to know the optimal vendor for each item. That can be done by turning the answer into a subquery and then doing `SELECT DISTINCT ON(item_id) * FROM the_subquery ORDER BY price`, yes? – Andrew Lazarus Aug 23 '13 at 20:07
  • OMG SQLFiddle is a thing! Anyhow, Andrew is right, I wouldn't want multiple results of socks, only a single result for each item. – James Lai Aug 23 '13 at 20:09
  • @AndrewLazarus: Right, the second step was missing. Added that. `ORDER BY` needs to match `DISTINCT ON`, though. So we need `ORDER BY item_id, price`. – Erwin Brandstetter Aug 23 '13 at 20:16
  • @ErwinBrandstetter: Even Homer nods. – Andrew Lazarus Aug 23 '13 at 20:19
0

Try this

CREATE TABLE #Prices ( Iid INT, Vid INT, Price Money, Created DateTime)
INSERT INTO #Prices 
SELECT 1, 1, 5.99 ,GETDATE()    UNION
SELECT 1, 2, 4.99 ,GETDATE()    UNION
SELECT 2, 1, 6.99 ,GETDATE()    UNION
SELECT 2, 2, 6.99 ,GETDATE()    UNION
SELECT 1, 1, 3.99 ,GETDATE()-1  UNION
SELECT 1, 2, 4.99 ,GETDATE()-1  UNION
SELECT 2, 1, 6.99 ,GETDATE()-1  UNION
SELECT 2, 2, 6.99 ,GETDATE()-1 

WITH CTE AS
(
    SELECT 
        MyPriority = ROW_NUMBER() OVER ( partition by Iid, Vid ORDER BY Created DESC, Price ASC) 
    ,   Iid
    ,   Vid
    ,   price
    ,   Created
    FROM #Prices 
)

SELECT * FROM CTE WHERE MyPriority = 1
Chris Rodriguez
  • 888
  • 7
  • 16
  • Very close - however, this results in an item_id being returned for each vendor_id. What would be the best way to get that down to a single item_id? – James Lai Aug 23 '13 at 19:57
0

It's also possible to do this with windowed functions, it will work on SQL Server version > 2005:

with cte1 as (
    select
        *,
        row_number() over(partition by vendor_id, item_id order by created_at desc) as row_num
    from prices
), cte2 as (
    select
        *,
        row_number() over(partition by item_id order by price asc) as row_num2
    from cte1
    where row_num = 1
)
select i.title, c.price, c.vendor_id
from cte2 as c
    inner join items as i on i.id = c.item_id
where c.row_num2 = 1;

sql fiddle demo(Thanks Erwin)

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197