Lookup rows using pairs of values and find exact matching groups

Question

I have these tables:

Project

id	name	version
1	Pete	0.0.1
2	Swag	0.0.1
3	Swag	0.0.2
4	Swag	0.0.3
5	Kale	0.0.1
6	Kale	0.0.2

Person

id	name
1	Jake
2	Skye
3	Kieth
4	Jim
5	Eliz

Person_Project

id	person_id	project_id
1	1	1
2	2	1
3	2	2
4	3	1
5	3	3
6	4	1
7	4	4
8	5	1
9	5	2
10	5	5

Persons will have unique projects, means no two persons will work on the same projects.

I am writing a java based api where I receive a json request with different projects and I have to return the person who is working on exact projects given in request.

Request:

[
    {"name": "Pete", "version": "0.0.1"}
]

This should return Jake

Request:

[
    {"name": "Pete", "version": "0.0.1"},
    {"name": "Swag", "version": "0.0.1"}
]

This should return Skye

Request:

[
    {"name": "Pete", "version": "0.0.1"},
    {"name": "Swag", "version": "0.0.2"}
]

This should return Kieth

I am writing SQL for this and not getting what I needed.

This is what I am up to

SELECT pe.id, pe.name
FROM person pe
LEFT JOIN person_project pepr on pepr.person_id = pe.id
WHERE pe.id IN (
    SELECT pepr.person_id
    FROM project pr
    LEFT JOIN person_project pepr ON pepr.project_id = pr.id
    WHERE pr.name IN ('Pete', 'Swag') AND pr.version IN ('0.0.1', '0.0.2')
    GROUP BY pepr.project_id
    HAVING COUNT(pepr.project_id) = 2
)
GROUP BY pe.id, pe.name
HAVING COUNT(pe.id) = 2

This is not right as I am using IN for versions which will gets applied to other projects.

I dont understand. I see many people working on project 1 so why should it return just Jake for example? — Salman A, May 24 '22 at 18:59
@SalmanA i am looking for person who only working project 1. I need exact match not IN search — user09, May 24 '22 at 19:08
in the second request I am looking for a person who is exactly working on project 1 and project 2. Third request is little different where request project 3 instead of project 2 as the version is 0.0.2 — user09, May 24 '22 at 19:13

Salman A · Accepted Answer · 2022-05-24T20:01:51.923

2

You can use SQL relational division logic as described in this answer. You're interested in the part that says exact division/no remainder:

with project_list as (
    select id
    from project
    where exists (
        select *
        from (values
            ('pete', '0.0.1'),
            ('swag', '0.0.1')
        ) as user_input(name, version)
        where project.name = user_input.name and project.version = user_input.version
    )
), person_project_copy as (
    select person_id, case when project_list.id is not null then 1 end as is_required
    from person_project
    left join project_list on person_project.project_id = project_list.id
)
select person_id
from person_project_copy
group by person_id
having count(is_required) = (select count(*) from project_list)
and    count(*)           = (select count(*) from project_list)

DB<>Fiddle for all three examples

edited May 24 '22 at 20:01

answered May 24 '22 at 19:47

Salman A

262,204
82
430
521

It returning Person 1 even for bad data – user09 May 24 '22 at 20:20
No idea then. Works for your sample data. – Salman A May 24 '22 at 20:20
I mean when ('pete', '0.0.1'), ('swag', '0.0.5') it returns Person 1 instead of no records. Looks like it found Person only based on pete 0.0.1 and not considered swag 0.0.5 as it doesn't exist – user09 May 24 '22 at 20:36
(Revised) Handling this case is trivial. I have added a workaround here: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=2455ba9258bae7c8356ff519da5aa4e0 (see 4th example). – Salman A May 24 '22 at 20:49

score 0 · Answer 2 · answered May 25 '22 at 19:31

This is a classic Relational Division Without Remainder question.

Start by placing your input data into a table variable or a Table Valued Parameter or a temp table.

Then you can use one of the standard relational division answers

SELECT
  p.name
FROM Person p
WHERE EXISTS (SELECT 1
    FROM Person_Project pp
    LEFT JOIN @input i
        JOIN Project prj ON prj.name = i.name AND prj.version = i.version
      ON pp.project_id = prj.id
    WHERE pp.person_id = p.id
    HAVING COUNT(prj.id) = COUNT(*)
       AND COUNT(prj.id) = (SELECT COUNT(*) FROM @input)
);

db<>fiddle

What this does is the following:

Take all Person rows where the following EXISTS subquery is true:
For each Person take all their Person_Project rows
Left-join the input data (joining the matching Project at the same time).
Group it up and ensure that the number of matches from the join is equal to the number of rows in the subquery...
... and the number is also equal to the total number of input rows.

There are other solutions to this also.

Lookup rows using pairs of values and find exact matching groups

2 Answers2