I have the following SQL query that selects any row matching ANY of the values in the list (9, 10):
SELECT
r.id, r.title,
u.name as 'Created By',
c.name as 'Category',
c.value,
cr.category_id
FROM
category_resource cr
INNER JOIN resource r
ON cr.resource_id = r.id
INNER JOIN user u
ON r.created_by = u.id
INNER JOIN category c
ON cr.category_id = c.id
WHERE cr.category_id IN ('10', '9');
I have tried finding out how to do the opposite, which I also need, i.e select rows that match ALL values.
I have read about using a statement sort of like this:
SELECT
r.id, r.title
FROM
resource r
WHERE
id IN (
SELECT
resource_id
FROM
category_resource
WHERE
category_id IN (9, 10)
GROUP BY
resource_id
HAVING
COUNT(DISTINCT category_id) = 2
);
This is my attempt at adapting this answer to my needs: SQL Server - select rows that match all items in a list
But that doesn't get me as much information back in the results as the first statement does. So how could I do something that is more equivalent? I've tried to put it together, but I'm too new to SQL to figure it out right, I just get errors...
LONG UPDATE:
Gordon Linoff pointed out that this is a strange request. And I know, I feel that it’s strange too, to have a query that returns multiple rows for the same resource. But I don’t know how to do this any other way, maybe I’m on the wrong track altogether, because it just hit me that the original request (the one getting all rows for resources with categories matching ANY in the list) does quite not fulfill my requirement either…
Here’s my requirement as a whole:
First of all, I think the model of this part of the db might be helpful.
(BTW category has a relation with itself also because it is stored as a hierarchy, using the adjacency model, where each category stores its parent id, if anyone was wondering about that arrow...)
1: Find all resources where a category matches ANY of the values in the list. But (this is where that was insufficient) for each of these resources, I need to know the resource along with all of the categories it has…
Let me explain the point of this with a simple example:
This is a many to many relationship as you can see. A resource (e.g let’s say with the title ”Introduction to carpentry in 18th century New England”) can be associated with many categories (e.g category.name = ”subject” value = ”Carpentry”, category.name=”subject” value = ”Wood”, category.name=”subject” value = ”New England”, category.name=”subject” value = ”History”). Note that this example is simplified, but you see the basic idea.
Now, if a user searches for a resource matching ANY of the categories ”Carpentry” and ”Painting”, the resource ”Introduction to carpentry in 18th century New England” should show up in the results, since one of its categories matched. But, here’s the rub, and why Gordon felt my request was strange: in the search result I want to present to the user, I want to list the title ”Introduction to carpentry in 18th century New England” along with a column showing all the categories that title is classified with, even though the user didn’t search for them - in order to give a better overview of the complete subject matter of this resource.
So how can I do that? The only way I could think of was the first statement in my question, but as I said it just hit me that it doesnt’ give me all categories a resource might have, only the ones actually searched for…
Of course I could do a query for the results first, only getting one row for each. And then do a second query, finding all categories for each resource in the results. But if the first query gives me 1000 results (which will be common), then to get the categories for all of these I would have to do 1000 queries to get the categories for each… Sounds like that would give me performance issues…
Am I thinking about this the wrong way? Is there another way to accomplish what I want to do? I.e give me the resources the query selects, along with all of that resource’s associated categories...
2: Well, after that long explanation, the second requirement is simpler to explain: again the same thing about getting all categories back for a resource selected, but this time the selection in the query should only get those resources that match ALL of the provided values. However, just because I provide all of the values in the query again doesn’ mean that I already have all the categories, since each resource in the results may actually have more (and other) categories and I need those too when presenting the results as mentioned in the first (ANY) requirement.