0

How can I write a sql statement which returns the latest record per day, per an Id. Eg. with the data as below.

Id   Name        Comment         Value    DateTime
1    Tim         Test            100      02/06/2020 15:05:12
2    Sue         House           200      03/06/2020 08:25:01
1    Tim         Test            150      02/06/2020 18:05:12
3    Doug        Cars            680      10/05/2019 04:45:10
2    Sue         Tennis          200      03/06/2020 10:35:15

I'd get :

Id   Name        Comment         Value    DateTime
1    Tim         Test            150      02/06/2020 18:05:12
3    Doug        Cars            680      10/05/2019 04:45:10
2    Sue         Tennis          200      03/06/2020 10:35:15

Would I need a sub-select query which groups by the max date time?

GMB
  • 216,147
  • 25
  • 84
  • 135
atamata
  • 997
  • 3
  • 14
  • 32
  • 1
    (1) What is your Oracle version? Different answers are available depending on that. (2) Why the `plsql` tag? (3) **Most importantly**: Can there be **ties**? Where there are two or more rows for the same Id and the same date, with exactly the same time-of-day component, all tied for "latest" in that goupr? If so, how should that be handled? Return ALL the rows tied for "latest record" for that person and date? Return only one of them, and if so, which one? (Or will "any one of the tied ones" work equally well?) –  Feb 13 '20 at 22:59

3 Answers3

1

A window function would be the approach I would take:

select id, name, comment, value, dateTime from
(
  select id, name, comment, value, dateTime
           , last_value(dateTime) over( partition by id, trunc(datetime)
                                        order by dateTime 
             rows  BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) lv
) 
where dateTime=lv
BobC
  • 4,208
  • 1
  • 12
  • 15
0

One simple way to do it is to filter with a correlated subquery:

select t.*
from mytable t
where t.datetime = (
    select max(t1.datetime)
    from mytable t1
    where t1.datetime >= trunc(t.datetime)
    and t1.datetime < trunc(t.datetime) + 1
)

You might also like then anti-left join approach:

select t.*
from mytable t
left join mytable t1 
    on  t1.datetime > t.datetime
    and t1.datetime >= trunc(t.datetime)
    and t1.datetime < trunc(t.datetime) + 1
where t1.id is null
GMB
  • 216,147
  • 25
  • 84
  • 135
  • 1
    Disadvantage of these methods if having to scan the table twice. Could be a performance problem if the table is large – BobC Feb 13 '20 at 23:13
  • @BobC: one way or another, you need to scan more than once (analytics functions do not really differ in that matter). – GMB Feb 13 '20 at 23:19
  • 1
    @GMB - I don't know why you believe that, but you are definitely wrong. Analytic functions were developed **specifically** with that improvement as the goal. The data may be processed more than once (inner query, outer query) but the **stored** data is read from disk only once. –  Feb 13 '20 at 23:49
  • @gmb, run an explain plan on the two solutions and you will see the number of scans. Also, I do not think your solution gives the desired results; the original question requires the results to be grouped/partitioned by ID. – BobC Feb 14 '20 at 01:53
0

By using FIRST/LAST you don't need even a sub-query:

WITH t(ID, NAME, COMMENT_, VALUE, DateTime) AS (
    SELECT 1,'Tim','Test', 100, TO_DATE('02/06/2020 15:05:12', 'dd/mm/yyyy hh24:mi:ss') FROM dual
    UNION ALL SELECT 2,'Sue','House', 200, TO_DATE('03/06/2020 08:25:01', 'dd/mm/yyyy hh24:mi:ss') FROM dual
    UNION ALL SELECT 1,'Tim','Test', 150, TO_DATE('02/06/2020 18:05:12', 'dd/mm/yyyy hh24:mi:ss') FROM dual
    UNION ALL SELECT 3,'Doug','Cars', 680, TO_DATE('10/05/2019 04:45:10', 'dd/mm/yyyy hh24:mi:ss') FROM dual
    UNION ALL SELECT 2,'Sue','Tennis', 300, TO_DATE('03/06/2020 10:35:15', 'dd/mm/yyyy hh24:mi:ss') FROM dual
    UNION ALL SELECT 2,'Sue','Steets', 400, TO_DATE('10/10/2020 10:35:15', 'dd/mm/yyyy hh24:mi:ss') FROM dual)
SELECT ID, 
    MAX(NAME) KEEP (DENSE_RANK LAST ORDER BY DateTime) AS NAME,
    MAX(COMMENT_) KEEP (DENSE_RANK LAST ORDER BY DateTime) AS COMMENT_,
    MAX(VALUE) KEEP (DENSE_RANK LAST ORDER BY DateTime) AS VALUE,
    MAX(DateTime) KEEP (DENSE_RANK LAST ORDER BY DateTime) AS DateTime
FROM t
GROUP BY ID, TRUNC(DateTime);

+------------------------------------------+
|ID|NAME|COMMENT_|VALUE|DATETIME           |
+------------------------------------------+
|1 |Tim |Test    |150  |02.06.2020 18:05:12|
|2 |Sue |Tennis  |300  |03.06.2020 10:35:15|
|2 |Sue |Steets  |400  |10.10.2020 10:35:15|
|3 |Doug|Cars    |680  |10.05.2019 04:45:10|
+------------------------------------------+
Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110