I've got a database where each entry is an edge with a source tag, a relationship and a weight. I want to perform a query where given a source tag, I get the top n edges by weight with that source tag per relationship.
For example, given the entries
Id Source Relationship End Weight
-----------------------------------------
1 cat isA feline 56
2 cat isA animal 12
3 cat isA pet 37
4 cat desires food 5
5 cat desires play 88
6 dog isA canine 72
If I queried using "cat" as a source and n=2, the result should be
Id Source Relationship End Weight
-----------------------------------------
1 cat isA feline 56
3 cat isA pet 37
4 cat desires food 5
5 cat desires play 88
I've tried several different approaches based on other questions.
The most sucessful so far is based on How to SELECT the newest four items per category?
SELECT *
FROM tablename t1
JOIN tablename t2 ON (t1.relationship = t2.relationship)
LEFT OUTER JOIN tablename t3
ON (t1.relationship = t3.relationship AND t2.weight < t3.weight)
WHERE t1.source = "cat"
AND t3.relationship IS NULL
ORDER BY t2.weight DESC;
However, this returns all the edges with source="cat" in sorted order. If I try to add LIMIT, I get the edges with the top weights not by group.
The other thing that I have tried is
SELECT *
FROM tablename t1
WHERE t1.source="cat"
AND (
SELECT COUNT(*)
FROM tablename t2
WHERE t1.relationship = t2.relationship
AND t1.weight <= t2.weight
) <= 2;
This returns
Id Source Relationship End Weight
-----------------------------------------
1 cat isA feline 56
4 cat desires food 5
5 cat desires play 88
Because edge 6 has a higher weight for the isA relationship than edge 2, but is excluded from the results because the source="dog"
I am very new to databases, so if I need to take a completely different approach, let me know. I'm not afraid of starting over.