12

I have a table

'products' => ('product_id', 'name', 'description') 

and a table

'product_price' => ('product_price_id', 'product_id', 'price', 'date_updated')

I want to perform a query something like

SELECT `p`.*, `pp`.`price` 
FROM `products` `p` 
LEFT JOIN `product_price` `pp` ON `pp`.`product_id` = `p`.`product_id`
GROUP BY `p`.`product_id` 
ORDER BY `pp`.`date_updated` DESC

As you can probably guess the price changes often and I need to pull out the latest one. The trouble is I cannot work out how to order the LEFT JOINed table. I tried using some of the GROUP BY functions like MAX() but that would only pull out the column not the row.

Thanks.

Simon
  • 5,158
  • 8
  • 43
  • 65
  • Can you explain what you would like to happen? Do you want to order by the max of the group? Do you want to join only with the latest update? Perhaps some example data and the expected output for that data would help clarify your question. – Mark Byers Jan 02 '11 at 20:58
  • I have one product with 5 different prices which different 'date_updated' datetimes. I need to join and pull out the latest price. – Simon Jan 02 '11 at 22:47
  • This is a simplification of a much larger query. I cannot use an INNER JOIN because in the real query I join to many tables and use an IF to select the NOT NULL column. – Simon Jan 02 '11 at 22:48

4 Answers4

16

It appears that it is impossible to use an ORDER BY on a GROUP BY summarisation. My fundamental logic is flawed. I will need to run the following subquery.

SELECT `p`.*, `pp`.`price` FROM `products` `p` 
LEFT JOIN (
    SELECT `price` FROM `product_price` ORDER BY `date_updated` DESC
) `pp` 
ON `p`.`product_id` = `pp`.`product_id`
GROUP BY `p`.`product_id`;

This will take a performance hit but as it is the same subquery for each row it shouldn't be too bad.

Simon
  • 5,158
  • 8
  • 43
  • 65
  • I asked a similar question just now and got this response from @Gordon Linoff http://stackoverflow.com/a/18763815/570796 – superhero Sep 12 '13 at 13:08
2

You need to set aliases properly I think and also set what you are joining on:

SELECT p.*, pp.price 
FROM products AS p 
LEFT JOIN product_price AS pp
ON pp.product_id = p.product_id 
GROUP BY p.product_id 
ORDER BY pp.date_updated DESC
Matt Asbury
  • 5,644
  • 2
  • 21
  • 29
  • Thanks. I missed the ON part. I have edited my query. I believe the AS part is optional. – Simon Jan 02 '11 at 22:50
-1

This will give you the last updated price:

select
  p.*, pp.price
from
  products p,
  -- left join this if products may not have an entry in prodcuts_price
  -- and you would like to see a null price with the product
  join 
  (
      select 
        product_price_id, 
        max(date_updated) 
      from products_price
      group by product_price_id
   ) as pp_max
    on p.product_id = pp.product_id
  join products_price pp on 
    pp_max.prodcuts_price_id = pp.products_price_id
nate c
  • 8,802
  • 2
  • 27
  • 28
  • This would probably work however I would really like to avoid doing a subquery as I have a large dataset. – Simon Jan 02 '11 at 22:49
  • Thats way the way that gives correct results. Other queries will give **all** prices. But hey if they work faster just use that right ...?@! – nate c Jan 02 '11 at 23:12
  • I appreciate your help but my query doesn't give all prices, it just gives me a random price which I hope to resolve by ordering correctly. – Simon Jan 02 '11 at 23:27
  • group by strategy is wrong and will fail. ordering happens after rows are selected. If speed is paramount denormalize your table and add a boolean price_current field into products_price that keeps track of what one you want. If you dont believe me, repost this again where it is more explicit with what you want. – nate c Jan 02 '11 at 23:42
  • 1
    @Simon, even with analytics (if only MySQL supported them) -- you'd still need a subquery. – OMG Ponies Jan 02 '11 at 23:58
  • 1
    what's the output of this? `(select product_price_id, max(date_updated) from products_price)` will undownvote if the logic/intent is clear. looks like wrong answer – Hao Jan 03 '11 at 00:47
  • The output is the most recent id and date that the price is updated. Whats unclear??? seems obvious. There are many prices. Only one is current. That current one is the most recent one. We need to know this to get current pricing. Once we have that id we can join the full table. There is no way to do this without a separate aggregate. – nate c Jan 03 '11 at 01:19
  • on which group MAX is evaluating its aggregation? why there is no GROUP BY? – Hao Jan 03 '11 at 02:33
  • and your sub-query returns one row only, each product's price is updated on different dates. why only return one row? there's something wrong with your query's logic – Hao Jan 03 '11 at 02:52
  • see my answer, how i simplify your answer (at the last part) http://stackoverflow.com/questions/4580479/mysql-left-join-group-by-and-order-by-not-working-as-required/4581849#4581849 – Hao Jan 03 '11 at 03:58
-2

Mysqlism:

SELECT p.*, MAX(pp.date_updated), pp.price 
FROM products p 
LEFT JOIN product_price pp ON pp.product_id = p.product_id
GROUP BY p.product_id 

Will work on some RDBMS:

SELECT p.*, pp.date_updated, pp.price 
FROM products p 
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE (p.product_id, pp.date_updated) 

   in (select product_id, max(date_updated) 
       from product_price 
       group by product_id)

Will work on most RDBMS:

SELECT p.*, pp.date_updated, pp.price 
FROM products p 
LEFT JOIN product_price pp ON pp.product_id = p.product_id

WHERE EXISTS
(
    select null -- inspired by Linq-to-SQL style :-)
    from product_price 

    WHERE product_id = p.product_id 

    group by product_id

    HAVING max(date_updated) = pp.date_updated
)

Will work on all RDBMS:

SELECT p.*, pp.date_updated, pp.price 
FROM products p 
LEFT JOIN product_price pp ON pp.product_id = p.product_id

LEFT JOIN 
(
    select product_id, max(date_updated) as recent 
    from product_price 
    group by product_id
) AS latest 
ON latest.product_id = p.product_id AND latest.recent = pp.date_updated

And if nate c's code intent is to just get one row from product_price, no need to table-derive (i.e. join (select product_price_id, max(date_updated) from products_price) as pp_max), he might as well just simplify(i.e. no need to use the product_price_id surrogate primary key) it like the following:

SELECT p.*, pp.date_updated, pp.price 
FROM products p 
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE pp.date_updated = (select max(date_updated) from product_price)
Hao
  • 8,047
  • 18
  • 63
  • 92
  • Your Mysqlism does not work. That is the whole the point of this post. The price that is selected does not **have to** although it may pick the price that coincides with the max date. http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html – nate c Jan 03 '11 at 21:22
  • The last query has nothing to do with my intent. – nate c Jan 03 '11 at 21:24
  • That's the point, u lack the GROUP BY clause in your subquery; so naturally it will pick up one row only. Then u add a GROUP BY on your subquery, using the product_price_id; I can hazard a guess that the product_price_id is a surrogate primary key, and what's more, it's a sequential id, whatdyathink? Why use GROUP BY on unique on each row on one table? it's absurd. Regarding Mysqlism, I'm not sure of my answer on that, I don't have MySQL on my box, I thought MySQL GROUP BY has super powers, I thought it has Postgresql's DISTINCT ON functionality, your first answer lead me to that thinking :-/ – Hao Jan 04 '11 at 14:29
  • Maybe you should put MySQL on your box instead of just making up answers for MySQL questions! – nate c Jan 04 '11 at 19:36
  • I know MySQL before, and I know some of many(correct) MySQLism. And your reference on GROUP BY without other fields, I know it very well, and in fact, Postgresql is embracing it, with a twist, Postgresql do it in a controlled manner(Postgres is still disciplined) http://www.depesz.com/index.php/2010/08/08/waiting-for-9-1-recognize-functional-dependency-on-primary-keys/ – Hao Jan 04 '11 at 23:52
  • It's only in this answer I'm mislead by someone's answer, you said that your subquery return row**s**, based on that I assumed MySQL MAX aggregation **can** pick up its corresponding row, that MySQL MAX can somehow mimic Postgresql DISTINCT ON, basing on your answer it looks like it is. Then when I answer in the same manner(MySQLism), you said it's not correct – Hao Jan 04 '11 at 23:55