GROUP BY x, y
means "I want one result row per x and y". Your query says "give me one result row per pidm and date" and you get exactly that. If you want one result ro per pidm instead, then group by pidm.
Apart from that, you only want rows where an ACC1 for the pidm exists with a score of at least 81. In order to do so, you search all such rows, only to use DISTINCT
then to boil it down to the pidm or null. My general advice: Use EXISTS
or IN
when you merely want to make sure that at least one matching entry exists. This gives the DBMS a lot less work to do, because it can stop with the first match found. In below query I am showing both approaches, a correlated EXISTS
clause for the AAC2 lookup and a non-correlated IN
clause for the ACC1 lookup.
select sortest_pidm pidm, max(sortest_test_date) as max_test_date
from sortest a
where sortest_tesc_code in ('ACC1', 'ACC2')
and sortest_pidm in
(
select sortest_pidm
from sortest
where sortest_tesc_code = 'ACC1' and sortest_test_score >= 81
)
and exists
(
select null
from sortest b
where b.sortest_pidm = a.sortest_pidm
and b.sortest_tesc_code = 'ACC2' and b.sortest_test_score >= 95
)
--and sortest_pidm = 319824
group by sortest_pidm
order by sortest_pidm;
But in your query you are looking at all ACC1 and ACC2 rows anyway in order to get the maximum date per pidm. That means that when aggregating we can say: please count the matches and only show me pidms for which the number of matches is greater than zero. This leads to this query:
select sortest_pidm pidm, max(sortest_test_date) as max_test_date
from sortest a
where sortest_tesc_code in ('ACC1', 'ACC2')
--and sortest_pidm = 319824
group by sortest_pidm
having count(case when sortest_tesc_code = 'ACC1' and sortest_test_score >= 81 then 1 end) > 1
and count(case when sortest_tesc_code = 'ACC2' and sortest_test_score >= 95 then 1 end) > 1
order by sortest_pidm;
This technique is called conditional aggregation (because we count based on conditions).