97

I want to join two tables, but only get 1 record of table2 per record on table1

For example:

SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id

This would get me all records in products, which is not what I want. I want 1 [the first] product per category (I have a sort column in the products field).

How do I go about doing that?

John Davidson
  • 973
  • 1
  • 6
  • 4

12 Answers12

123

I like more another approach described in a similar question: https://stackoverflow.com/a/11885521/2215679

This approach is better especially in case if you need to show more than one field in SELECT. To avoid Error Code: 1241. Operand should contain 1 column(s) or double sub-select for each column.

For your situation the Query should looks like (this query also will work in PostgresQL and it is pretty fast, see my update below):

This is the fastest query. Use this one.

SELECT
 c.id,
 c.title,
 p.id AS product_id,
 p.title AS product_title
FROM categories AS c
JOIN products AS p ON
 p.id = (                                 --- the PRIMARY KEY
  SELECT p1.id FROM products AS p1
  WHERE c.id=p1.category_id
  ORDER BY p1.id LIMIT 1
 )

PS. I did the performance test of the query vs other proposed here, and this query is the best option yet!

UPDATE (2022-07-20, PostgresSQL)

I'm not working with mySQL for a while already, so, I decided to test the performance of my solution (which actually works perfect in both MySQL and PostgresQL) with solution provided by @Gravy in PostgresQL v.12.9.

For that I decided to create a dummy tables and data with 100 categories and 100000 products. You can check the code on this gist

I run my query above and it took only 13ms to run.

After I slightly modified (for postgres) the query from @Gravy:

This is slow, do not use it!

SELECT
  id,
  category_title,
  (array_agg(product_title))[1]  
FROM
    (SELECT c.id, c.title AS category_title, p.id AS product_id, p.title AS product_title
    FROM categories AS c
    JOIN products AS p ON c.id = p.category_id
    ORDER BY c.id ASC) AS a 
GROUP BY id, category_title;

and run it too. It took more than 150ms in my machine. Which is >10x times slower.

In defense of @gravy's solution, I agree with n+1 problem. But, in this particular case, usually the number of products is way larger than categories. So, running through each category is way less expensive than running through each product as in @Gravy's query.

By the way, if your table has 1mln products with 100 categories, the speed of my query is still the same (between 9-17ms), but the query from [@Gravy] takes more than 2 seconds to run

In resume, at this moment, my query is the most performant and optimal solution for the current task.

Feel free to comment.

Kostanos
  • 9,615
  • 4
  • 51
  • 65
34

Accepted answer by @goggin13 looks wrong. Other solutions provided to-date will work, but suffer from the n+1 problem and as such, suffer a performance hit.

n+1 problem: If there are 100 categories, then we would have to do 1 select to get the categories, then for each of the 100 categories returned, we would need to do a select to get the products in that category. So 101 SELECT queries would be performed.

My alternative solution solves the n+1 problem and consequently should be significantly more performant as only 2 selects are being performed.

SELECT
  *
FROM
    (SELECT c.id, c.title, p.id AS product_id, p.title
    FROM categories AS c
    JOIN products AS p ON c.id = p.category_id
    ORDER BY c.id ASC) AS a 
GROUP BY id;
Gravy
  • 12,264
  • 26
  • 124
  • 193
  • 3
    Interesting. Normally if you use ORDER BY and GROUP BY in a single query the GROUP BY will run first and ORDER BY will sort the groups. I take it that is the way to make ORDER BY go first. – Gherman Apr 05 '16 at 06:55
  • @German - That's correct. And that way, you can select `group_concat` ensuring that your results are in the correct order. – Gravy Apr 05 '16 at 09:44
  • Can anyone suggest a tweak to make this work with MySQL 5.7? It complains about non aggregated columns since ONLY_FULL_GROUP_BY is enabled by default. Thanks! – Chris Bartley Oct 03 '17 at 18:06
  • 2
    @ChrisBartley - Simply add the missing group by fields on any aggregated columns. You could also use `ANY_VALUE()` as per https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html – Gravy Oct 04 '17 at 18:18
  • Hey, thank you for your solution. I took a time, and wrote a speed test for both, yours and mine queries. As I was expected, you query is >10x times slower. The main reason is because usually there are much more products in the table than categories, and your query runs through products table instead of mine, that runs through the categories. Please see my answer, I updated it. – Kostanos Jul 20 '22 at 12:19
17
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
GROUP BY c.id

This will return the first data in products (equals limit 1)

Jessé Catrinck
  • 2,227
  • 19
  • 20
7

What about this?

SELECT c.id, c.title, (SELECT id from products AS p 
                            WHERE c.id = p.category_id 
                            ORDER BY ... 
                            LIMIT 1)
   FROM categories AS c;
Krab
  • 2,118
  • 12
  • 23
3

The With clause would do the trick. Something like this:

WITH SELECTION AS (SELECT id FROM products LIMIT 1)
SELECT a.id, c.id, c.title FROM selection a JOIN categories c ON (c.id = a.id);
la_kal
  • 31
  • 2
  • As a sidenote, this will not work for MySQL 5.7 or lower... "WITH ... AS" syntax is available in MySQL 8, however. – drapkin11 Aug 02 '21 at 22:23
  • No, it won't: https://www.db-fiddle.com/f/66RqT8qPrTfh9zmgCnqrrG/0 (schema taken from https://stackoverflow.com/q/11885394/1073003) – cl0ne May 09 '23 at 16:00
1
SELECT
 c.id,
 c.title,
 p.id AS product_id,
 p.title AS product_title
FROM categories AS c
JOIN products AS p ON
 p.id = (                               
  SELECT MIN(p1.id) FROM products AS p1
  WHERE c.id=p1.category_id
 )

Using MIN or MAX in a subquery will make your query run much faster.

0

Another example with 3 nested tables: 1/ User 2/ UserRoleCompanie 3/ Companie

  • 1 user has many UserRoleCompanie.
  • 1 UserRoleCompanie has 1 user and 1 Company
  • 1 Companie has many UserRoleCompanie
SELECT 
u.id as userId, 
u.firstName,
u.lastName,
u.email,
urc.id ,
urc.companieRole,
c.id as companieId,
c.name as companieName
FROM User as u 
JOIN UserRoleCompanie as urc ON u.id = urc.userId
    AND urc.id = (
        SELECT urc2.id
        FROM UserRoleCompanie urc2 
        JOIN Companie ON urc2.companieId = Companie.id
        AND urc2.userId = u.id 
        AND Companie.isPersonal = false
        order by Companie.createdAt DESC
        
        limit 1
    )
    
LEFT JOIN Companie as c ON urc.companieId = c.id
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| userId                    | firstName | lastName           | email                     | id                        | companieRole | companieId                | companieName      |
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| cjjt9s9iw037f0748raxmnnde | henry     | pierrot            | henry@gmail.com           | cjtuflye81dwt0748e4hnkiv0 | OWNER        | cjtuflye71dws0748r7vtuqmg | leclerc           |
Alan
  • 9,167
  • 4
  • 52
  • 70
0

In my opinion, this is the best answer (making it general):

SELECT  
TB1.Id  
FROM Table1 AS TB1  
INNER JOIN Table2 AS TB2 ON (TB1.Id = TB2.Id_TB1)  
    AND TB2.Id = (  
        SELECT Id  
        FROM Table2  
        WHERE TB1.Id = Id_TB1  
        ORDER BY Table2.Id DESC  
        LIMIT 1  
    )  
  • You're responding to a 10-year-old question. This answer will not limit the result, since the WHERE clause doesn't seem to do anything to make the JOIN selection unique. – Dennis Mar 01 '21 at 13:56
0

Assuming you want product with MIN()imial value in sort column, it would look something like this.

SELECT 
  c.id, c.title, p.id AS product_id, p.title
FROM 
  categories AS c
INNER JOIN (
  SELECT
    p.id, p.category_id, p.title
  FROM
    products AS p
  CROSS JOIN (
    SELECT p.category_id, MIN(sort) AS sort
    FROM products
    GROUP BY category_id
  ) AS sq USING (category_id)
) AS p ON c.id = p.category_id
Mchl
  • 61,444
  • 9
  • 118
  • 120
-1

When using postgres you can use the DISTINCT ON syntex to limit the number of columns returned from either table.

Here is a sample of the code:

SELECT c.id, c.title, p.id AS product_id, p.title FROM categories AS c JOIN ( SELECT DISTINCT ON(p1.id) id, p1.title, p1.category_id FROM products p1 ) p ON (c.id = p.category_id)
The trick is not to join directly on the table with multiple occurrences of the id, rather, first create a table with only a single occurrence for each id

Yaki Klein
  • 3,978
  • 3
  • 37
  • 34
  • Thanks for sharing. Unfortunately, MySQL 5.7 doesn't have many of the features that make this problem easy to solve in Postgres or SQL Server. MySQL 8 does, however, as it has the "WITH ... AS" syntax that can be applied as another answer suggests. If you find a way that would work for MySQL 5.7+, definitely let us know. – drapkin11 Aug 02 '21 at 22:31
-3

Replace the tables with yours:

SELECT * FROM works w 
LEFT JOIN 
(SELECT photoPath, photoUrl, videoUrl FROM workmedias LIMIT 1) AS wm ON wm.idWork = w.idWork
josliber
  • 43,891
  • 12
  • 98
  • 133
-43

I would try something like this:

SELECT C.*,
      (SELECT P.id, P.title 
       FROM products as P
       WHERE P.category_id = C.id
       LIMIT 1)
FROM categories C
goggin13
  • 7,876
  • 7
  • 29
  • 44