Two tables - one with e-mails and videos watched by e-mail owner and the other with videos and cast names worked in them. As I understand the e-mail owners could watch and cast names could participate in multiple videos with cross over possibilities. Something like in this two tables (as sample data):
WITH
t_watch AS
(
Select 'aaa.aaa@aaa.aa' "E_MAIL", 101 "VIDEO_ID" From Dual Union All
Select 'ccc.ccc@ccc.cc' "E_MAIL", 101 "VIDEO_ID" From Dual Union All
Select 'aaa.aaa@aaa.aa' "E_MAIL", 102 "VIDEO_ID" From Dual Union All
Select 'bbb.bbb@bbb.bb' "E_MAIL", 102 "VIDEO_ID" From Dual Union All
Select 'ccc.ccc@ccc.cc' "E_MAIL", 201 "VIDEO_ID" From Dual Union All
Select 'ddd.ddd@ddd.dd' "E_MAIL", 101 "VIDEO_ID" From Dual Union All
Select 'ddd.ddd@ddd.dd' "E_MAIL", 201 "VIDEO_ID" From Dual Union All
Select 'aaa.aaa@aaa.aa' "E_MAIL", 301 "VIDEO_ID" From Dual
),
t_work_in AS
(
Select 101 "VIDEO_ID", 'Chriss' "CAST_NAME" From Dual Union All
Select 101 "VIDEO_ID", 'David' "CAST_NAME" From Dual Union All
Select 101 "VIDEO_ID", 'Annie' "CAST_NAME" From Dual Union All
Select 102 "VIDEO_ID", 'Chriss' "CAST_NAME" From Dual Union All
Select 201 "VIDEO_ID", 'Chriss' "CAST_NAME" From Dual Union All
Select 201 "VIDEO_ID", 'David' "CAST_NAME" From Dual Union All
Select 201 "VIDEO_ID", 'Annie' "CAST_NAME" From Dual Union All
Select 202 "VIDEO_ID", 'Annie' "CAST_NAME" From Dual Union All
Select 301 "VIDEO_ID", 'Robert' "CAST_NAME" From Dual
),
The question is about getting the e-mail address of a person watched all the videos participated by specific cast name. To get the answer you should get the lists of:
- unique ordered video ids watched by particular e-mail owner
- unique ordered video ids that particular cast name worked in
To do that create two ctes - cte_watched_by and cte_worked_in:
cte_worked_in AS
(
SELECT DISTINCT
wk.CAST_NAME,
LISTAGG(wk.VIDEO_ID, ', ') WITHIN GROUP (Order By wk.VIDEO_ID) OVER(Partition By wk.CAST_NAME) "WORKED_LIST",
Count(DISTINCT wk.VIDEO_ID) OVER(Partition By wk.CAST_NAME) "COUNT_IDS_WORKED"
FROM
(Select DISTINCT wrk.CAST_NAME, wrk.VIDEO_ID From t_work_in wrk Left Join t_watch wtc ON(wtc.VIDEO_ID = wrk.VIDEO_ID) Order By wrk.VIDEO_ID) wk
),
cte_watched_by AS
(
SELECT DISTINCT
wb.E_MAIL,
LISTAGG(wb.VIDEO_ID, ', ') WITHIN GROUP (Order By wb.VIDEO_ID) OVER(Partition By wb.E_MAIL) "WATCHED_LIST",
Count(DISTINCT wb.VIDEO_ID) OVER(Partition By wb.E_MAIL) "COUNT_IDS_WATCHED"
FROM
(Select DISTINCT wtc.E_MAIL, wtc.VIDEO_ID From t_watch wtc Left Join t_work_in wrk ON(wrk.VIDEO_ID = wtc.VIDEO_ID) Order By wtc.VIDEO_ID) wb
)
This is what we have got so far:
cte_worked_in
CAST_NAME |
WORKED_LIST |
COUNT_IDS_WORKED |
Chriss |
101, 102, 201 |
3 |
Robert |
301 |
1 |
Annie |
101, 201, 202 |
3 |
David w101, 201 |
2 |
|
cte_watched_by
E_MAIL |
WATCHED_LIST |
COUNT_IDS_WATCHED |
ccc.ccc@ccc.cc |
101, 201 |
2 |
ddd.ddd@ddd.dd |
101, 201 |
2 |
bbb.bbb@bbb.bb |
102 |
1 |
aaa.aaa@aaa.aa |
101, 102, 301 |
3 |
Now we can join those ctes and get the answer - below is main SQL:
SELECT DISTINCT
wk.CAST_NAME "WORKED_IN",
wk.COUNT_IDS_WORKED "COUNT_IDS_WORKED",
wk.WORKED_LIST,
wb.E_MAIL "WATCHED_BY",
wb.COUNT_IDS_WATCHED "COUNT_IDS_WATCHED",
wb.WATCHED_LIST
FROM
cte_worked_in wk
INNER JOIN
cte_watched_by wb ON
(
(wk.COUNT_IDS_WORKED = wb.COUNT_IDS_WATCHED And wk.WORKED_LIST = wb.WATCHED_LIST)
OR
(wk.COUNT_IDS_WORKED = 1 And wb.COUNT_IDS_WATCHED > 1 And InStr(wb.WATCHED_LIST, wk.WORKED_LIST) > 0)
OR
(wk.COUNT_IDS_WORKED = 2 And wb.COUNT_IDS_WATCHED > 2 And InStr(wb.WATCHED_LIST, SubStr(wk.WORKED_LIST, 1, 3)) > 0 And InStr(wb.WATCHED_LIST, SubStr(wk.WORKED_LIST, 6, 3)) > 0)
)
... and here is resulting dataset:
WORKED_IN |
COUNT_IDS_WORKED |
WORKED_LIST |
WATCHED_BY |
COUNT_IDS_WATCHED |
WATCHED_LIST |
Robert |
1 |
301 |
aaa.aaa@aaa.aa |
3 |
101, 102, 301 |
David |
2 |
101, 201 |
ddd.ddd@ddd.dd |
2 |
101, 201 |
David |
2 |
101, 201 |
ccc.ccc@ccc.cc |
2 |
101, 201 |
The ORs within ON expression of INNER JOIN serve to cover situations where somebody watch a lot of movies with cast names that works in just few movies. It is the weakest part of this answer.
The resulting dataset could be filtered (if needed) to show just results of a particular cast name and/or e-mail.
Regards...