The goal here is to:

1. Fetch the row with the most recent date from EACH store for EACH ingredient.
2. From this result, compare the prices to find the cheapest store for EACH ingredient.

I can accomplish either the first or the second goal in separate queries, but not both in the same query. How can I filter a selection and then apply another filter to the previous result?

EDIT: I've been having problems with the results I get from MAX and MIN, since they only aggregate the one column and the rest of the row's data is fetched arbitrarily. To avoid this I'm supposed to join tables on multiple columns (I guess). I'm not sure how this will work with duplicate dates etc.

I've included an image of a query and its output data.

[Image: the query and its output data]

If we use ingredient1 as an example, it exists in three separate stores (in one store twice on different dates).

In this case the cheapest current price for ingredient1 would be at store3. If the fourth row, dated 2013-05-25, were even cheaper, it would still not "win" because it is out of date. (Disregard the brand names; they don't really matter in this problem.)

Would appreciate any help/input you can offer!

raecer
  • Can you post the table declarations (preferably as something that can be cut and pasted) with some example data, please? – Kickstart Mar 20 '14 at 11:54

3 Answers

This probably needs a couple of subqueries joined together.

This isn't tested (I don't have your table definitions or any test data), but something like this:

SELECT i.name AS ingredient,
        ip.price,
        ip.date,
        s.name AS storename,
        b.name AS brandname
FROM ingredient i
INNER JOIN ingredient_price ip
ON i.ingredient_id = ip.ingredient_id
INNER JOIN store_to_ingredient_price stip
ON ip.ingredient_price_id = stip.ingredient_price_id
INNER JOIN store s
ON stip.store_id = s.store_id
INNER JOIN brand_to_ingredient_price btip
ON ip.ingredient_price_id = btip.ingredient_price_id
INNER JOIN brand b
ON btip.brand_id = b.brand_id
INNER JOIN
(
    -- Sub2: lowest price per ingredient and store on that store's latest date
    -- (this resolves duplicate rows that share the same latest date)
    SELECT i.ingredient_id,
            stip.store_id,
            ip.date,
            MIN(ip.price) AS lowest_price
    FROM ingredient i
    INNER JOIN ingredient_price ip
    ON i.ingredient_id = ip.ingredient_id
    INNER JOIN store_to_ingredient_price stip
    ON ip.ingredient_price_id = stip.ingredient_price_id
    INNER JOIN
    (
        -- Sub1: most recent date per ingredient and store
        SELECT i.ingredient_id,
                stip.store_id,
                MAX(ip.date) AS latest_date
        FROM ingredient i
        INNER JOIN ingredient_price ip
        ON i.ingredient_id = ip.ingredient_id
        INNER JOIN store_to_ingredient_price stip
        ON ip.ingredient_price_id = stip.ingredient_price_id
        GROUP BY i.ingredient_id, stip.store_id
    ) Sub1
    ON i.ingredient_id = Sub1.ingredient_id
    AND stip.store_id = Sub1.store_id
    AND ip.date = Sub1.latest_date
    GROUP BY i.ingredient_id, stip.store_id, ip.date
) Sub2
ON  i.ingredient_id = Sub2.ingredient_id
AND stip.store_id = Sub2.store_id
AND ip.date = Sub2.date
AND ip.price = Sub2.lowest_price
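
As far as I can tell, the query above covers the first goal (one current row per ingredient and store) but does not yet pick the cheapest store per ingredient. One way to add that second step is to join the current rows against the minimum current price per ingredient. The following is only a sketch of that idea: to keep it short and runnable it assumes a hypothetical flattened table price_list in place of the real joins, and the table name, column names and sample values are all made up for illustration.

```sql
-- Hypothetical flattened table standing in for the joined result.
-- Names and values are illustrative only, not taken from the real schema.
CREATE TABLE price_list (
    ingredient VARCHAR(50),
    store      VARCHAR(50),
    price      DECIMAL(5,2),
    price_date DATE
);

INSERT INTO price_list VALUES
    ('ingredient1', 'store1', 12.00, '2013-09-01'),
    ('ingredient1', 'store2', 11.00, '2013-09-02'),
    ('ingredient1', 'store3',  9.50, '2013-09-03'),
    ('ingredient1', 'store1',  5.00, '2013-05-25');  -- cheaper, but out of date

SELECT cur.ingredient, cur.store, cur.price, cur.price_date
FROM (
    -- Step 1: the current row for each (ingredient, store)
    SELECT p.ingredient, p.store, p.price, p.price_date
    FROM price_list p
    INNER JOIN (
        SELECT ingredient, store, MAX(price_date) AS latest_date
        FROM price_list
        GROUP BY ingredient, store
    ) latest
    ON p.ingredient = latest.ingredient
    AND p.store = latest.store
    AND p.price_date = latest.latest_date
) cur
INNER JOIN (
    -- Step 2: the lowest price among the current rows, per ingredient
    SELECT p.ingredient, MIN(p.price) AS lowest_price
    FROM price_list p
    INNER JOIN (
        SELECT ingredient, store, MAX(price_date) AS latest_date
        FROM price_list
        GROUP BY ingredient, store
    ) latest
    ON p.ingredient = latest.ingredient
    AND p.store = latest.store
    AND p.price_date = latest.latest_date
    GROUP BY p.ingredient
) cheapest
ON cur.ingredient = cheapest.ingredient
AND cur.price = cheapest.lowest_price;
-- With the sample rows: ingredient1 | store3 | 9.50 | 2013-09-03
```

The derived table for the current rows is written out twice because older MySQL versions have no common table expressions. If two stores tie on the lowest current price, both rows are returned.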
Kickstart

This question is really interesting!

So, first, we get the row with the most recent date from EACH store for EACH ingredient. (The most recent date can differ from store to store.) Then we compare those prices across stores (regardless of the date) to find the lowest price for each ingredient.

The query below relies heavily on the GROUP_CONCAT function. Here's a SO question regarding the use of the function.

SELECT
   i.name as ingredient_name
  , MIN(store_price.price) as price
  , SUBSTRING_INDEX(
    GROUP_CONCAT(store_price.date ORDER BY store_price.price),
    ',',
    1
    ) as date
  , SUBSTRING_INDEX(
    GROUP_CONCAT(s.name ORDER BY store_price.price),
    ',',
    1
    ) as store_name
  , SUBSTRING_INDEX(
    GROUP_CONCAT(b.name ORDER BY store_price.price),
    ',',
    1
    ) as brand_name
FROM
  ingredient i
JOIN
(SELECT
  ip.ingredient_id as ingredient_id
  , stip.store_id as store_id
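  -- take the brand from the most recent row instead of an arbitrary row in the group
  , CONVERT(SUBSTRING_INDEX(
    GROUP_CONCAT(btip.brand_id ORDER BY ip.date DESC),
    ',',
    1
    ), UNSIGNED INTEGER) as brand_id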
  , CONVERT(SUBSTRING_INDEX(
    GROUP_CONCAT(ip.ingredient_price_id ORDER BY ip.date DESC),
    ',',
    1
    ), UNSIGNED INTEGER) as ingredient_price_id
  , MAX(ip.date) as date
  , CONVERT(SUBSTRING_INDEX(
    GROUP_CONCAT(ip.price ORDER BY ip.date DESC),
    ',',
    1
    ), DECIMAL(5,2)) as price
FROM ingredient_price ip
JOIN store_to_ingredient_price stip ON ip.ingredient_price_id = stip.ingredient_price_id
JOIN brand_to_ingredient_price btip ON ip.ingredient_price_id = btip.ingredient_price_id
GROUP BY 
  ip.ingredient_id
  , stip.store_id) store_price
ON i.ingredient_id = store_price.ingredient_id
JOIN store s ON s.store_id = store_price.store_id
JOIN brand b ON b.brand_id = store_price.brand_id
GROUP BY
  store_price.ingredient_id;

You can check the implementation on this SQL Fiddle.
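
The building block used throughout the query is SUBSTRING_INDEX(GROUP_CONCAT(... ORDER BY ...), ',', 1), which picks the value from the first row of each group under a chosen ordering. Here is a minimal sketch of just that idiom, assuming a hypothetical flattened table price_list; the table, its columns and the sample values are invented for illustration and are not the real schema.

```sql
-- Hypothetical table; names and values are illustrative only.
CREATE TABLE price_list (
    ingredient VARCHAR(50),
    store      VARCHAR(50),
    price      DECIMAL(5,2),
    price_date DATE
);

INSERT INTO price_list VALUES
    ('ingredient1', 'store1', 12.00, '2013-09-01'),
    ('ingredient1', 'store1',  5.00, '2013-05-25'),
    ('ingredient1', 'store2', 11.00, '2013-09-02');

SELECT
    ingredient,
    store,
    MAX(price_date) AS latest_date,
    -- GROUP_CONCAT builds e.g. '12.00,5.00' (newest first);
    -- SUBSTRING_INDEX keeps the part before the first comma;
    -- CAST turns the resulting string back into a number.
    CAST(SUBSTRING_INDEX(
        GROUP_CONCAT(price ORDER BY price_date DESC),
        ',', 1) AS DECIMAL(5,2)) AS latest_price
FROM price_list
GROUP BY ingredient, store;
-- With the sample rows: store1 -> 2013-09-01, 12.00 and store2 -> 2013-09-02, 11.00
```

Two caveats apply to this idiom: GROUP_CONCAT output is truncated at group_concat_max_len (1024 bytes by default), and values that themselves contain a comma will be cut short by SUBSTRING_INDEX unless a different SEPARATOR is used.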

The version below, which ignores the brand, is slightly shorter than the full query:

SELECT
   i.name as ingredient_name
  , MIN(store_price.price) as price
  , SUBSTRING_INDEX(
    GROUP_CONCAT(store_price.date ORDER BY store_price.price),
    ',',
    1
    ) as date
  , SUBSTRING_INDEX(
    GROUP_CONCAT(s.name ORDER BY store_price.price),
    ',',
    1
    ) as store_name
FROM
  ingredient i
JOIN
(SELECT
  ip.ingredient_id as ingredient_id
  , stip.store_id as store_id
  , CONVERT(SUBSTRING_INDEX(
    GROUP_CONCAT(ip.ingredient_price_id ORDER BY ip.date DESC),
    ',',
    1
    ), UNSIGNED INTEGER) as ingredient_price_id
  , MAX(ip.date) as date
  , CONVERT(SUBSTRING_INDEX(
    GROUP_CONCAT(ip.price ORDER BY ip.date DESC),
    ',',
    1
    ), DECIMAL(5,2)) as price
FROM ingredient_price ip
JOIN store_to_ingredient_price stip ON ip.ingredient_price_id = stip.ingredient_price_id
GROUP BY 
  ip.ingredient_id
  , stip.store_id) store_price
ON i.ingredient_id = store_price.ingredient_id
JOIN store s ON s.store_id = store_price.store_id
GROUP BY
  store_price.ingredient_id;

References: Simulating First/Last aggregate functions in MySQL
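
For completeness, on MySQL 8.0 or later the same two steps can be written with window functions instead of the GROUP_CONCAT workaround. This is a different technique from the queries above, shown only as a sketch: it assumes a hypothetical flattened table price_list, and the table name, columns and sample values are made up for illustration.

```sql
-- Hypothetical table; names and values are illustrative only.
-- Requires MySQL 8.0+ (common table expressions and window functions).
CREATE TABLE price_list (
    ingredient VARCHAR(50),
    store      VARCHAR(50),
    price      DECIMAL(5,2),
    price_date DATE
);

INSERT INTO price_list VALUES
    ('ingredient1', 'store1', 12.00, '2013-09-01'),
    ('ingredient1', 'store2', 11.00, '2013-09-02'),
    ('ingredient1', 'store3',  9.50, '2013-09-03'),
    ('ingredient1', 'store1',  5.00, '2013-05-25');  -- cheaper, but out of date

WITH latest AS (
    -- Number each store's rows per ingredient, newest first
    SELECT ingredient, store, price, price_date,
           ROW_NUMBER() OVER (PARTITION BY ingredient, store
                              ORDER BY price_date DESC) AS rn
    FROM price_list
),
priced AS (
    -- Keep only the newest row per store, then number by price, cheapest first
    SELECT ingredient, store, price, price_date,
           ROW_NUMBER() OVER (PARTITION BY ingredient
                              ORDER BY price ASC) AS price_rank
    FROM latest
    WHERE rn = 1
)
SELECT ingredient, store, price, price_date
FROM priced
WHERE price_rank = 1;
-- With the sample rows: ingredient1 | store3 | 9.50 | 2013-09-03
```

ROW_NUMBER() returns exactly one row per ingredient and breaks ties arbitrarily; swapping it for RANK() in the second step would keep every store that ties on the lowest price.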

Joseph B

Try this:

SELECT  `newest`.ingredient, `newest`.store,
        `newest`.brand, `newest`.price, `newest`.`latest_date`
FROM
        (SELECT         ingredient.name AS ingredient, store.name AS store,
                        brand.name AS brand, ingredient_price.price,
                        MAX( ingredient_price.date ) AS `latest_date`
        FROM            ingredient
        LEFT OUTER JOIN ingredient_price
            ON ingredient.ingredient_id = ingredient_price.ingredient_id
        LEFT OUTER JOIN store_to_ingredient_price
            ON ingredient_price.ingredient_price_id = store_to_ingredient_price.ingredient_price_id
        LEFT OUTER JOIN store
            ON store_to_ingredient_price.store_id = store.store_id
        LEFT OUTER JOIN brand_to_ingredient_price
            ON ingredient_price.ingredient_price_id = brand_to_ingredient_price.ingredient_price_id
        LEFT OUTER JOIN brand
            ON brand_to_ingredient_price.brand_id = brand.brand_id
        GROUP BY ingredient.name) `newest`

ORDER BY `newest`.price
LIMIT 1
Itay
  • I'm not that good with queries, but this returns the latest date for each ingredient (not for each store), then shows the cheapest ingredient (not the cheapest store per ingredient). I may have been too vague in my first question. I'll try adding more info above. – raecer Sep 10 '13 at 13:33