0

I've built a query (in Oracle) that selects any row with the max date out of duplicated rows. I based my query on the one presented here, which uses a nested grouping:

SELECT *
FROM (
      SELECT Train, MAX(Time) as MaxTime
      FROM TrainTable
      GROUP BY Train
) r
INNER JOIN TrainTable t
ON t.Train = r.Train AND t.Time = r.MaxTime

Now, since that query doesn't account for duplicated values in Time (as commented there), I'd like to take the first record out of each "duplicated" grouped record, and to still be able to use select *.

How can I do it?

(P.S. I tried using the other solution (using over (partition ...)), but it didn't work, and I'd need to figure it out)

OfirD
  • 9,442
  • 5
  • 47
  • 90
  • So basically, you want to be able to use "SELECT * FROM " and also a Group By clause? Honestly, even if you get a satisfactory answer it will be inferior to learning and understanding what "the other solution" does with a partition. It's more complicated to understand but well worth your time. – Barry Piccinni Aug 06 '18 at 14:55
  • @BarryPiccinni, I'm certainly going to do that, but also want to know how to use this kind of query :) – OfirD Aug 06 '18 at 15:02

4 Answers4

1

If you have PK in table (TrainTable) then you can use fetch first clause with subquery :

select t.*
from traintable t
where t.pk = (select t1.pk
              from traintable t1
              where t1.train = t.train 
              order by t1.time desc
              fetch first 1 rows only
             );
Yogesh Sharma
  • 49,870
  • 5
  • 26
  • 52
1

Here is another way you can make it without window function

You can try to use a subquery in select, to get COUNT be row_number and set the partition by columns in subquery where clause.

CREATE TABLE TrainTable(
    train varchar(10),
    Time  date
);
INSERT INTO TrainTable values ('A',to_date('2017-01-01','YYYY-MM-DD'));
INSERT INTO TrainTable values ('A',to_date('2018-01-01','YYYY-MM-DD'));
INSERT INTO TrainTable values ('B',to_date('2017-05-01','YYYY-MM-DD'));
INSERT INTO TrainTable values ('B',to_date('2017-05-05','YYYY-MM-DD'));
INSERT INTO TrainTable values ('C',to_date('2017-01-01','YYYY-MM-DD'));

Query 1:

SELECT *
FROM 
(
    SELECT t.*,(SELECT 
                COUNT(*) FROM TrainTable t1 
                WHERE t1.Time >= t.Time AND 
                      t1.train = t.train
               ) rn
    FROM TrainTable t
    ORDER BY t.Time DESC
) t
where t.rn = 1

Results:

| TRAIN |                 TIME | RN |
|-------|----------------------|----|
|     A | 2018-01-01T00:00:00Z |  1 |
|     B | 2017-05-05T00:00:00Z |  1 |
|     C | 2017-01-01T00:00:00Z |  1 |
D-Shih
  • 44,943
  • 6
  • 31
  • 51
1

If you insist in using this example in your answer, you could lump your current query into a WITH block and then SELECT DISTINCT like so:

WITH query AS (
  SELECT r.*
  FROM (
    SELECT train, MAX(trainTime) as MaxTime
    FROM trainTimes
    GROUP BY train
  ) r
  INNER JOIN trainTimes t
  ON t.train = r.train AND t.trainTime = r.MaxTime
)
SELECT DISTINCT *
FROM query;

Here is a SQL fiddle of this working: SQL Fiddle

HOWEVER

This is inefficient, and whilst it works for the current example it will probably get messy in a real DB. If Gordon's answer gives you the desired output, it's a far better solution.

Barry Piccinni
  • 1,685
  • 12
  • 23
0

Use row_number():

select t.*
from (select t.*,
             row_number() over (partition by train order by time desc) as seqnum
      from traintable t
     ) t
where seqnum = 1;

This returns an arbitrary row when there are ties for the matching time. SQL tables represent unordered sets, so there is no "first" row, unless another column specifies that ordering. If so, then you can include that column in the order by clause.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • It seems there's no easy-enough *and* performant way to do it without using `partition`, so although this answer doesn't use the original query, I'll accept it. – OfirD Aug 06 '18 at 18:36