-1

hopefully someone has an idea for a more performant Oracle SQL.

Following Oracle PERSON-Table:

ID PERSON_ID REFERENCE_ID COMMENT CREATION_DATE
1 10066 666 comment 2 days ago 01-11-2022 09:16:00.00000
2 10066 111 single comment 01-11-2022 11:44:00.00000
3 10066 666 comment 1 day ago 02-11-2022 07:37:00.00000
4 33444 666 comment of different person 02-11-2022 09:54:00.00000
5 10066 666 comment today 03-11-2022 08:46:00.00000
6 10066 987 another comment 03-11-2022 09:02:00.00000
7 10066 987 another comment same day 03-11-2022 09:44:22.123456

I want to have only the recent timestamps of identical REFERENCE_ID results from a specific PERSON_ID.

So I expect 3 rows should be in the result list for PERSON_ID 10066:

PERSON_ID REFERENCE_ID COMMENT CREATION_DATE
10066 111 single comment 01-11-2022 11:44:00.00000
10066 666 comment today 03-11-2022 08:46:00.00000
10066 987 another comment same day 03-11-2022 09:44:22.123456

I came up with a subselect idea which works, but is possibly not the best / most performant solution:

SELECT * FROM PERSON_TABLE p WHERE CREATION_DATE = (
   SELECT MAX(CREATION_DATE)
   FROM PERSON_TABLE
   WHERE REFERENCE_ID = p.REFERENCE_ID AND PERSON_ID = 10066
);

Has someone a better idea? Or is my approach ok performance wise? I have the feeling there are more optimized statements / queries possible, maybe without subselect.

Thanks in advance, have a nice day!

  • 1
    Does this answer your question? [Select First Row of Every Group in sql](https://stackoverflow.com/questions/16529701/select-first-row-of-every-group-in-sql) – astentx Nov 03 '22 at 05:01
  • What is the logic behind? You say "only the recent timestamps", but they all have the same timestamp. So how do you for example now that the entry with id 7 should be selected from your sample data and not the entry with id 6? You can't be serious to decide this based on the id column and just select the highest id? – Jonas Metzler Nov 03 '22 at 05:15
  • @JonasMetzler thank you for your question. I mean "recent" the "youngest" Timestamp. The Timestamps above differ only in day (was too lazy to change times). So I expect the 03-11-2022 (today, younger than 02-11-2022) to be in the result set. Not the older timestamps. Yes ID 6 and 7 should differ in time, i edit this, was my mistake. Your idea with highest id could work because of the sequence, always new inserted ids are always younger. I will ask my boss if this is always the way – Eager Beaver Nov 03 '22 at 05:38
  • No, that was NOT my idea because I think this is the wrong way. "Was too lazy to change times" for just seven rows of sample data makes no good impression to me when asking for assistance. But thank you for editing the question to make your requirements more clear. – Jonas Metzler Nov 03 '22 at 05:43
  • /pedant : performant is a word, but it doesn't mean what you think..... – Mitch Wheat Nov 03 '22 at 06:02

3 Answers3

2

I would suggest this one:

WITH t (ID, person_id, reference_id, "COMMENT", creation_date) AS (
    SELECT 1, 1066, 666, 'comment 2 days ago', TO_DATE('01.11.2022 09:16', 'dd.mm.yyyy hh24:mi') FROM dual UNION ALL
    SELECT 2, 1066, 111, 'single comment', TO_DATE('01.11.2022 11:44', 'dd.mm.yyyy hh24:mi') FROM dual UNION ALL
    SELECT 3, 1066, 666, 'comment 1 day ago', TO_DATE('02.11.2022 07:37', 'dd.mm.yyyy hh24:mi') FROM dual UNION ALL
    SELECT 5, 1066, 666, 'comment of different person', TO_DATE('03.11.2022 08:46', 'dd.mm.yyyy hh24:mi') FROM dual UNION ALL
    SELECT 6, 1066, 987, 'comment today', TO_DATE('03.11.2022 09:02', 'dd.mm.yyyy hh24:mi') FROM dual UNION ALL
    SELECT 7, 1066, 987, 'another comment', TO_DATE('03.11.2022 09:44', 'dd.mm.yyyy hh24:mi') FROM dual
)
SELECT PERSON_ID, REFERENCE_ID,
    MIN("COMMENT") KEEP (DENSE_RANK LAST ORDER BY creation_date) AS "COMMENT",
    MIN(creation_date) KEEP (DENSE_RANK last ORDER BY creation_date) AS creation_date
FROM t
GROUP BY PERSON_ID, REFERENCE_ID;

It avoids self-join and should be the fastest solution.

Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110
  • Why `DENSE_RANK` and not `ROW_NUMBER`? If the OP adds `MIN(another_column) KEEP (DENSE RANK)`, it becomes possible that the values for `comment` and `another_column` could be from different rows *(if multiple rows share the same creation date)*... – MatBailie Nov 03 '22 at 14:24
  • @MatBailie Because `DENSE_RANK` is the only possible, see [FIRST](https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/FIRST.html#GUID-85AB9246-0E0A-44A1-A7E6-4E57502E9238) Yes, it may return ties, the TO did not specify what should happen in this case. – Wernfried Domscheit Nov 03 '22 at 14:49
  • That's a very good reason! – MatBailie Nov 03 '22 at 16:10
1

One option is to rank rows per each person and reference, and then fetch the ones that rank as the highest.

Sample data:

SQL> with person (id, person_id, reference_id, creation_date) as
  2    (select 1, 1066, 666, to_date('01.11.2022 09:16', 'dd.mm.yyyy hh24:mi') from dual union all
  3     select 2, 1066, 111, to_date('01.11.2022 11:44', 'dd.mm.yyyy hh24:mi') from dual union all
  4     select 3, 1066, 666, to_date('02.11.2022 07:37', 'dd.mm.yyyy hh24:mi') from dual union all
  5     select 5, 1066, 666, to_date('03.11.2022 08:46', 'dd.mm.yyyy hh24:mi') from dual union all
  6     select 6, 1066, 987, to_date('03.11.2022 09:02', 'dd.mm.yyyy hh24:mi') from dual union all
  7     select 7, 1066, 987, to_date('03.11.2022 09:44', 'dd.mm.yyyy hh24:mi') from dual
  8    ),

Query begins here:

  9  temp as
 10    (select p.*,
 11            row_number() over (partition by person_id, reference_id order by creation_date desc) rn
 12     from person p
 13    )
 14  select id, person_id, reference_id, creation_date
 15  from temp
 16  where person_id = 1066
 17    and rn = 1;

        ID  PERSON_ID REFERENCE_ID CREATION
---------- ---------- ------------ --------
         2       1066          111 01.11.22
         5       1066          666 03.11.22
         7       1066          987 03.11.22

SQL>
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
-2

If I am understanding the question correctly I think a DISTINCT clause may achieve the same results.

Something like select distinct REFERENCE_ID, PERSON_ID, COMMENT, CREATION_DATE from PERSON_TABLE order by REFERENCE_ID desc.

mio123
  • 1
  • thanks for the anwer. I tried distinct but it seems not to work. Maybe because I have to add WHERE PERSON_ID = 10066 or my SQL skills are just too bad – Eager Beaver Nov 03 '22 at 06:03
  • I think you better ignore this answer, it doesn't make sense. DISTINCT will slow down your query and does not apply any "recent" or "max" condition. – Jonas Metzler Nov 03 '22 at 06:47
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 07 '22 at 16:50