I have a PostgreSQL 8.3.4 DB to keep information about photo taggings.
First off, my table definitions:
create table photos (
id integer
, user_id integer
, primary key (id, user_id)
);
create table tags (
photo_id integer
, user_id integer
, tag text
, primary key (user_id, photo_id, tag)
);
What I'm trying to do + simple example:
I am trying to return all the photos that have at least k other photos with at least j common tags.
I. e., if Photo X has these tags (info field in the tags table):
gold
clock
family
And photo Y has the next tags:
gold
sun
family
flower
X and Y have 2 tags in common. For k = 1
and j = 2
X and Y will be returned.
What I have tried
SELECT tags1.user_id , users.name, tags1.photo_id
FROM users, tags tags1, tags tags2
WHERE ((tags1.info = tags2.info) AND (tags1.photo_id != tags2.photo_id)
AND (users.id = tags1.user_id))
GROUP BY tags1.user_id, tags1.photo_id, tags2.user_id, tags2.photo_id, users.name
HAVING ((count(tags1.info) = <j>) and (count(*) >= <k>))
ORDER BY user_id asc, photo_id asc
My failed results:
When I tried to run it on those tables:
photos
photo_id user_id
0 0
1 0
2 0
20 1
23 1
10 3
tags
photo_id user_id tag
0 0 Car
0 0 Bridge
0 0 Sky
20 1 Car
20 1 Bridge
10 3 Sky
The result for k = 1
and j = 1
:
Expected:
| user_id | User Name | photo_id |
| 0 | Bob | 0 |
| 1 | Ben | 20 |
| 3 | Lev | 10 |
Actual:
| user_id | User Name | photo_id |
| 0 | Bob | 0 |
| 3 | Lev | 10 |
For k = 2
and j = 1
:
Expected:
| user_id | User Name | photo_id |
| 0 | Bob | 0 |
Actual: empty result.
For j = 2
and k = 2
:
Expected: empty result.
Actual:
| user_id | User Name | Photo ID |
| 0 | Bob | 0 |
| 1 | Ben | 20 |
How to solve this properly?