When faced with this sort of problem, I use TDQD — Test-Driven Query Design.
Please note that it helps everyone if you give your table a name!
Pass 1
List of products with one or more attributes identical to product 1
SELECT a.ProductID, COUNT(*) AS matches
FROM EAV_Table AS a
JOIN EAV_Table AS b
ON a.Attribute = b.Attribute AND a.value = b.value
WHERE a.ProductID != 1
AND b.ProductID = 1
GROUP BY a.ProductID
This clearly won't list any products where the count is 0, which is fine.
List of products with one or more attributes mismatching product 1
SELECT c.ProductID, COUNT(*) AS matches
FROM EAV_Table AS c
JOIN EAV_Table AS d
ON c.Attribute = d.Attribute AND c.value != d.value
WHERE c.ProductID != 1
AND d.ProductID = 1
GROUP BY c.ProductID
This also won't list products where the count is 0, which is more of a nuisance.
Result — Pass 1
We need all the products from the first query where the product is not listed in the second query. That can be expressed by a NOT EXISTS and a correlated sub-query:
SELECT a.ProductID, COUNT(*) AS matches
FROM EAV_Table AS a
JOIN EAV_Table AS b
ON a.Attribute = b.Attribute AND a.value = b.value
WHERE a.ProductID != 1
AND b.ProductID = 1
AND NOT EXISTS
(SELECT c.ProductID
FROM EAV_Table AS c
JOIN EAV_Table AS d
ON c.Attribute = d.Attribute AND c.value != d.value
WHERE c.ProductID != 1
AND d.ProductID = 1
AND c.ProductID = a.ProductID
)
GROUP BY a.ProductID
That's pretty ugly. It works, but it is ugly.
Test Data
CREATE TABLE eav_table
(
productid INTEGER NOT NULL,
attribute CHAR(5) NOT NULL,
value INTEGER NOT NULL,
PRIMARY KEY(productid, attribute, value)
);
INSERT INTO eav_table VALUES(1, "Prop1", 1);
INSERT INTO eav_table VALUES(1, "Prop2", 2);
INSERT INTO eav_table VALUES(2, "Prop1", 1);
INSERT INTO eav_table VALUES(3, "Prop1", 1);
INSERT INTO eav_table VALUES(3, "Prop2", 3);
INSERT INTO eav_table VALUES(4, "Prop1", 1);
INSERT INTO eav_table VALUES(4, "Prop3", 1);
Q1 Result
2 1
3 1
4 1
Q2 Result
3 1
Q3 Result
2 1
4 1
The ones are the counts I generated; a more polished rendition would remove them.
Pass 2
If it can be managed, a nicer final query would join a table that lists all the product IDs which have at least one matching attribute/value pair in common with product ID 1 with a table that lists all the product IDs which have zero disagreements with product ID 1.
List of products with one or more attributes identical to product 1
The first query is the same as the first query in Pass 1, except we'll drop the count in the result set.
SELECT a.ProductID
FROM EAV_Table AS a
JOIN EAV_Table AS b
ON a.Attribute = b.Attribute AND a.value = b.value
WHERE a.ProductID != 1
AND b.ProductID = 1
GROUP BY a.ProductID
Either the GROUP BY clause or a DISTINCT in the select list is necessary, in general (though the sample data doesn't formally require it).
List of products with zero attributes that mismatch product 1
We will exploit the fact that COUNT(column)
only counts non-null values, and use a LEFT OUTER JOIN.
SELECT c.ProductID
FROM EAV_Table AS c
LEFT JOIN EAV_Table AS d
ON c.Attribute = d.Attribute
AND c.Value != d.Value
AND c.ProductID != 1
AND d.ProductID = 1
GROUP BY c.ProductID
HAVING COUNT(d.Value) == 0;
Note that the WHERE clause has been merged into the ON clause; this is actually rather important.
Result — Pass 2
We build the two queries above as sub-queries that are joined to generate the final result:
SELECT f.ProductID
FROM (SELECT a.ProductID
FROM EAV_Table AS a
JOIN EAV_Table AS b
ON a.Attribute = b.Attribute AND a.value = b.value
WHERE a.ProductID != 1
AND b.ProductID = 1
GROUP BY a.ProductID
) AS e
JOIN (SELECT c.ProductID
FROM EAV_Table AS c
LEFT JOIN EAV_Table AS d
ON c.Attribute = d.Attribute
AND c.Value != d.Value
AND c.ProductID != 1
AND d.ProductID = 1
GROUP BY c.ProductID
HAVING COUNT(D.Value) = 0
) AS f
ON e.ProductID = f.ProductID
This produces the answer 2 and 4 on the sample data.
Note that one part of this exercise is learning not to be satisfied with the first answer you develop. Note that it would be good to benchmark the solutions on full size data sets instead of test data sets with only 7 rows in the table.