1

so I'm wondering how you would write the following query. I've already figured out how to get what I want, but I'm sure it could be done better and my solution is a little bit ... well ... not how it should be :)

So I have 3 tables, the first being "articles" where articles (for a shop) are stored, then shopping "categories", for example menswear, women, pants, shirts etc. and a pivot-table to connect articles to multiple categories. Typical many-to-many relationship.

1) articles (id, name, price, ...)

2) categories (id, name, ...)

3) article_categories (article_id, category_id)

I want to retrieve all articles which are in both the categories with name "menswear" and "pants". Have in mind that the query might be extended to get articles that are in more categories. They should only be returned if they are in all the categories, not only one of them! Right now I managed to get it working using the following query which checks if the count of categories from GROUP BY is the same as the number of categories I have in the WHERE-clause:

SELECT articles.* 
FROM articles a, categories c, article_categories ac
WHERE ac.article_id = a.id
AND ac.category_id = c.id
AND (
c.name =  'menswear'
OR c.name =  'pants'
)
GROUP BY a.id
HAVING COUNT( c.id ) =2

So how would you write this query?

Thanks :)

  • 1
    possible duplicate of [How do I return only rows where all conditions are met based on an array of intergers (IDs)?](http://stackoverflow.com/questions/20888136/how-do-i-return-only-rows-where-all-conditions-are-met-based-on-an-array-of-inte) – Barmar Apr 04 '14 at 23:47
  • 1
    I've answered a number of similar questions (including the one I just linked to), and your solution is essentially how I've done it in the others. – Barmar Apr 04 '14 at 23:48
  • @Barmar I see, so it's actually not bad to do it like this I guess. Thanks ;) – Felix Neumeyer Apr 05 '14 at 00:08

1 Answers1

1

This way is probably going to be more efficient (faster) in your case...

SELECT a.* FROM
(
  SELECT ac.article_id
  FROM categories AS c
  JOIN article_categories AS ac
  ON c.id = ac.category_id
  AND c.`name` IN ('menswear', 'pants')
  GROUP BY ac.article_id
  HAVING COUNT(DISTINCT c.`name`) = 2
) AS ArticleIDs
JOIN
articles AS a
ON ArticleIDs.article_id=a.id;

It avoids doing a triple join all at once. Instead it just does the join from categories to article_categories and checks the HAVING condition after that. Then you can take that much smaller result set for your final join in the outer query.

See the SQLFiddle

If your tables have unique constraint checking you can probably skip the DISTINCT clause I use and go with your original version of the HAVING condition.

Would be great to hear if this is actually faster in production.

Andrew G
  • 349
  • 1
  • 6