11

I have two tables (country & ducks) where the country table has every country in the world and the ducks table has a list of ducks with a country_id field to link to the main country.

I'm trying to get a list of only countries with at least one duck in it and with that a single matching record from the ducks table for the highest rated duck within that country. So far I have:

SELECT *
FROM country c 
INNER JOIN ducks d ON c.id = d.country_id
ORDER BY c.country ASC, d.rating DESC

This returns a list of every duck rather than just one per country.

I'd be grateful if anyone can point me in the right direction here. I'd rather do it in SQL than have a separate query for each country to pull out the top rated duck.

hakre
  • 193,403
  • 52
  • 435
  • 836
Al_
  • 2,479
  • 7
  • 29
  • 43

5 Answers5

21
SELECT c.*, d.*
FROM country c 
  INNER JOIN ducks d 
    ON d.id =                         --- guessing the ducks Primary Key here
       ( SELECT dd.id                 --- and here  
         FROM ducks dd
         WHERE c.id = dd.country_id
         ORDER BY dd.rating DESC
         LIMIT 1
       )

An index on (country_id, rating, id) for MyISAM table or (country_id, rating) for InnoDB table, would help.

This query will show only one duck per country, even with more than one having the same rating. If you want ducks with tied rating to appear, use @imm's GROUP BY answer.

ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
8

You could try just adding a selecting join, for

SELECT c.*, d.*
FROM country c 
INNER JOIN ducks d ON c.id = d.country_id
LEFT JOIN ducks d2 ON d.country_id = d2.country_id AND d2.rating > d.rating
WHERE d2.id IS NULL
Naltharial
  • 2,132
  • 14
  • 21
  • 1
    And it's better to have `SELECT c.*, d.*`. No need to include the `d2.*` columns - which will be all NULL. – ypercubeᵀᴹ Mar 05 '12 at 08:25
  • It's fun to remove subquery, but looks like this version is horrible slow and memory consuming if `ducks` table large enough. Because it operates on count(ducks)*count(ducks) set – zxcat Aug 23 '15 at 11:58
  • fix: it operates on `count(ducks)*(count(ducks)-1)` sets (not whole `ducks`, but per each `country_id`). Ok, it's not so horrible when separated by `country_id`. But @imm solution should be more efficient if proper indexes used – zxcat Aug 23 '15 at 12:11
  • @zxcat: That's not where the complexity of this query is at all. The operation to group and compare ratings if far more expensive that just looking at the space you're querying. A subquery is almost never faster than a `JOIN`, especially in this case where the latter uses an `eq_ref` comparison to the subquery's `index` (in the ideal case). Run an `EXPLAIN SELECT` on the queries and compare the execution plan. – Naltharial Aug 24 '15 at 13:02
3

You might try:

SELECT c.*, d.*
FROM country c
INNER JOIN (
    SELECT d.country_id, d.id, MAX(d.rating) AS rating
    FROM ducks d
    GROUP BY d.country_id
) q ON (q.country_id = c.id)

INNER JOIN ducks d 
    ON (d.country_id, d.rating) = (q.country_id, q.rating)
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
imm
  • 5,837
  • 1
  • 26
  • 32
  • This only needs one more join to show ducks details: `INNER JOIN ducks d ON (d.country_id, d.rating) = (q.country_id, q.rating)` – ypercubeᵀᴹ Mar 04 '12 at 22:15
0

Try this:

SELECT c.country, MAX(d.rating) AS max_rating
FROM country c
JOIN ducks d ON c.id = d.country_id
GROUP BY c.id
ORDER BY c.country ASC

If the "highest rating" is 1, then change MAX(d.rating) to MIN(d.rating)

J. Bruni
  • 20,322
  • 12
  • 75
  • 92
  • Hi, that's very close as it returns a single country with the max rating, I'm trying to also return the record that corresponds to that maximum rating. eg if I change the first line of your code to `SELECT c.country, d.id, d.rating, MAX(d.rating)` then d.rating doesn't always equal MAX(d.rating) if that makes sense – Al_ Mar 04 '12 at 22:06
  • @ypercube: thanks. Nevermind... I can't test it here now. I wanted to make it happen without subqueries. Now OP has already resolved his case... – J. Bruni Mar 04 '12 at 22:20
-1

Many databases have some equivalent of "select top 10 * from...". In mySql, the syntax would be "select * from ... limit 10".

... BUT ...

In this case, uou really want "group by" and "max()"!

paulsm4
  • 114,292
  • 17
  • 138
  • 190
  • 1
    Hi, the group by limits the results, but the MAX() function just displays the maximum for the group, but the rest of the row contains the data for the first record found by MySQL rather than the one with the maximum rating. This is what I have so far: `SELECT c.*, d.*, MAX(d.rating) FROM country c INNER JOIN ducks d ON c.id = d.country_id GROUP BY c.country ORDER BY c.country ASC, d.rating DESC` – Al_ Mar 04 '12 at 22:03