0

The following query returns two rows. A PIDM (number) and two different sortest_test_date values.

319824|21-MAR-18
319824|18-APR-18

I would like the query to return a single row containing the PIDM and the max sortest_test_date. How do I accomplish this?

select distinct a.sortest_pidm pidm, max(a.sortest_test_date) max_test_date
from sortest a
where (a.sortest_tesc_code = 'ACC1' or a.sortest_tesc_code = 'ACC2')
and a.sortest_pidm = (select distinct sortest_pidm from sortest b where a.sortest_pidm = b.sortest_pidm and b.sortest_tesc_code = 'ACC1' and b.sortest_test_score >= 81)
and a.sortest_pidm = (select distinct sortest_pidm from sortest b where a.sortest_pidm = b.sortest_pidm and b.sortest_tesc_code = 'ACC2' and b.sortest_test_score >= 95)
and a.sortest_pidm = 319824
group by a.sortest_pidm, a.sortest_test_date;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    Why do you group by date if you want maximum date **per `sortest_pidm`**? Also `distinct` is not needed in case of aggregation, because result is already grouped (surprisingly, `group by` does grouping). Please, check the [documentation](https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6) for the meaning of each part of `select` – astentx Sep 22 '21 at 23:46
  • this is why supplied the answer i did. which just didn't include code because it assumed he knew what he was doing. – John Sohn Sep 23 '21 at 16:56

4 Answers4

0

Basically, you need to remove the date from the group by. You can also make other improvements to the query as well.

select a.sortest_pidm pidm, max(a.sortest_test_date) max_test_date
from sortest a
where a.sortest_tesc_code in ('ACC1', 'ACC2') and
      a.sortest_pidm = (select sortest_pidm from sortest b where a.sortest_pidm = b.sortest_pidm and b.sortest_tesc_code = 'ACC1' and b.sortest_test_score >= 81
                       ) and
      a.sortest_pidm = (select sortest_pidm from sortest b where a.sortest_pidm = b.sortest_pidm and b.sortest_tesc_code = 'ACC2' and b.sortest_test_score >= 95) and
      a.sortest_pidm = 319824
group by a.sortest_pidm;

Basically, there is no reason to have distinct anywhere in the query. I suspect that the subqueries can be improved as well (say by using window functions), but your question doesn't provide enough information to make specific suggestions.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Use TOP 1 (SQL Server) in select statement to get top row. Use order by to order the result set by date in desc order.

Jaguar Nation
  • 122
  • 11
  • It looks like the OP wants the max date per pidm or even for one single pidm, not the pidm for the max date in the table. But who knows. Your query is invalid though. You cannot select pidm and max(date) without a group by pidm. I suppose you wanted to remove the `MAX`. It is also wrong that we use `ROWNUM < 2` in order to get the maximum date, because `ROWNUM` happens before `ORDER BY`. We use the standard SQL `FETCH FIRST` clause where SQL Server uses their propriatary `TOP`. Then, if you select one row with `TOP 1`, what is `DISTINCT` for in your query? – Thorsten Kettner Sep 23 '21 at 05:21
  • @ThorstenKettner - Didn't have SSMS to validate the query... was written in free hand... Deleted SQL script and gave simple solution. Thanks for reviewing my code. – Jaguar Nation Sep 24 '21 at 09:03
  • He/She should give this answer the check mark. Lol – Jaguar Nation Sep 24 '21 at 09:09
0

Just Simple It.

Select a.sortest_pidm pidm, a.sortest_test_date max_test_date
from sortest a join (
Select sortest_pidm pidm, max(a.sortest_test_date) max_test_date
from sortest 
group by sortest_pidm
) b on (a.pidm = b.pidm and a.sortest_test_date = b.max_test_date);
0

GROUP BY x, y means "I want one result row per x and y". Your query says "give me one result row per pidm and date" and you get exactly that. If you want one result ro per pidm instead, then group by pidm.

Apart from that, you only want rows where an ACC1 for the pidm exists with a score of at least 81. In order to do so, you search all such rows, only to use DISTINCT then to boil it down to the pidm or null. My general advice: Use EXISTS or IN when you merely want to make sure that at least one matching entry exists. This gives the DBMS a lot less work to do, because it can stop with the first match found. In below query I am showing both approaches, a correlated EXISTS clause for the AAC2 lookup and a non-correlated IN clause for the ACC1 lookup.

select sortest_pidm pidm, max(sortest_test_date) as max_test_date
from sortest a
where sortest_tesc_code in ('ACC1', 'ACC2')
and sortest_pidm in
(
  select sortest_pidm
  from sortest
  where sortest_tesc_code = 'ACC1' and sortest_test_score >= 81
)
and exists
(
  select null
  from sortest b
  where b.sortest_pidm = a.sortest_pidm
  and b.sortest_tesc_code = 'ACC2' and b.sortest_test_score >= 95
)
--and sortest_pidm = 319824
group by sortest_pidm
order by sortest_pidm;

But in your query you are looking at all ACC1 and ACC2 rows anyway in order to get the maximum date per pidm. That means that when aggregating we can say: please count the matches and only show me pidms for which the number of matches is greater than zero. This leads to this query:

select sortest_pidm pidm, max(sortest_test_date) as max_test_date
from sortest a
where sortest_tesc_code in ('ACC1', 'ACC2')
--and sortest_pidm = 319824
group by sortest_pidm
having count(case when sortest_tesc_code = 'ACC1' and sortest_test_score >= 81 then 1 end) > 1
   and count(case when sortest_tesc_code = 'ACC2' and sortest_test_score >= 95 then 1 end) > 1
order by sortest_pidm;

This technique is called conditional aggregation (because we count based on conditions).

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73