0

update: changed one time to show that the times per shipment may not be in sequential order always.

here is my input

create table test
(
    shipment_id int,
    stop_seq tinyint,
    time datetime
)

insert into test values (1,1,'2009-8-10 8:00:00')
insert into test values (1,2,'2009-8-10 9:00:00')
insert into test values (1,3,'2009-8-10 10:00:00')
insert into test values (2,1,'2009-8-10 13:00:00')
insert into test values (2,2,'2009-8-10 14:00:00')
insert into test values (2,3,'2009-8-10 20:00:00')
insert into test values (2,4,'2009-8-10 18:00:00')

the output that i want is below

shipment_id  start    end
-----------  -----    ---
     1        8:00    10:00
     2        13:00   18:00

i need to take the time from the min(stop) row for each shipment and the time from the max(stop) row and place in start/end respectively. i know this can be done with multiple queries rather easily but i am looking to see if a single select query can do this.

thanks!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
thomas
  • 2,592
  • 1
  • 30
  • 44

6 Answers6

4

I think the only way you'll be able to do it is with sub-queries.

SELECT shipment_id
    , (SELECT TOP 1 time 
        FROM test AS [b] 
        WHERE b.shipment_id = a.shipment_id 
        AND b.stop_seq = MIN(a.stop_seq)) AS [start]
    , (SELECT TOP 1 time 
        FROM test AS [b] 
        WHERE b.shipment_id = a.shipment_id 
        AND b.stop_seq = MAX(a.stop_seq)) AS [end]
FROM test AS [a]
GROUP BY shipment_id

You'll need to use the DATEPART function to chop up the time column to get your exact output.

MyItchyChin
  • 13,733
  • 1
  • 24
  • 44
  • the times may not always be in sequential order...i forgot to show that in my example...updated my question to reflect this. – thomas Aug 10 '09 at 20:22
  • this works. thanks. i knew it could be done with multiple selects but the purist in me was looking for alternatives. – thomas Aug 10 '09 at 20:43
  • @thomas : If you create a clustered index on shipment_id and stop_seq the impact of doing the sub-queries will be negligible because all of the clauses are sargable and will perform index seeks. – MyItchyChin Aug 10 '09 at 21:19
  • @thomas: If you have a shipments table, see the addendum to my solution for a more efficient alternative to CptSkippy's solution, but along the same lines. – Steve Kass Aug 10 '09 at 23:25
  • 1
    @CptSkippy: Not everyone might realize that your TOP 1 is there to avoid getting an error if the subqueries return more than one row. Personally, I'd leave TOP 1 out for two reasons. First, TOP 1 without ORDER BY is a bad pattern to use in general, because it's nondeterministic, and second, because if the subquery returns more than one row, chances are that there is a flaw in design or logic somewhere that is only being hidden. Just my 2c worth. Your solution is a good one, though, and probably as efficient as any! – Steve Kass Aug 10 '09 at 23:31
  • @Steve: All valid points and I agree with you to an extent. I think it really boils down to the amount of control you have over the systems you're working with. – MyItchyChin Aug 11 '09 at 01:25
1

Use a Common Table Expression (CTE) - this works (at least on my SQL Server 2008 test system):

WITH SeqMinMax(SeqID, MinID, MaxID) AS
(
    SELECT Shipment_ID, MIN(stop_seq), MAX(stop_seq)
    FROM test
    GROUP BY Shipment_ID
)
SELECT 
    SeqID 'Shipment_ID',
    (SELECT TIME FROM test 
       WHERE shipment_id = smm.seqid AND stop_seq = smm.minid) 'Start',
    (SELECT TIME FROM test 
       WHERE shipment_id = smm.seqid AND stop_seq = smm.maxid) 'End'
FROM seqminmax smm

The SeqMinMax CTE selects the min and max "stop_seq" values for each "shipment_id", and the rest of the query then builds on those values to retrieve the associated times from the table "test".

CTE's are supported on SQL Server 2005 (and are a SQL:2003 standard feature - no Microsoft "invention", really).

Marc

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
0

Am I correct in thinking that you want the first time rather than the 'min' time, and the last time in the sequence rather than the 'max' time?

fras
  • 123
  • 1
  • 3
  • 8
  • that is correct. the time associated with the first stop (i.e. min(stop_seq)) and the time associated with the last stop (i.e. max(stop_seq)) – thomas Aug 10 '09 at 20:40
0
SELECT C.shipment_id, C.start, B2.time AS stop FROM
(    
   SELECT A.shipment_id, B1.time AS start, A.max_stop_seq FROM
   (
      SELECT shipment_id, MIN(stop_seq) as min_stop_seq, MAX(stop_seq) as max_stop_seq 
      FROM test
      GROUP BY shipment_id
   ) AS A

   INNER JOIN 

   (
      SELECT shipment_id, stop_seq, time FROM test
   ) AS B1

   ON A.shipment_id = B1.shipment_id AND A.min_stop_seq = B1.stop_seq
) AS C

INNER JOIN

(
   SELECT shipment_id, stop_seq, time FROM test
) AS B2

ON C.shipment_id = B2.shipment_id AND C.max_stop_seq = B2.stop_seq
Richard Dunlap
  • 1,957
  • 11
  • 18
0
select t1.shipment_id, t1.time start, t2.time [end]
from (
    select shipment_id, min(stop_seq) min, max(stop_seq) max
    from test
    group by shipment_id
) a
inner join test t1 on a.shipment_id = t1.shipment_id and a.min = t1.stop_seq 
inner join test t2 on a.shipment_id = t2.shipment_id and a.max = t2.stop_seq 
D'Arcy Rittich
  • 167,292
  • 40
  • 290
  • 283
0

I suggest you take advantage of row_number and PIVOT. This may look messy, but I think it will perform well, and it's more adaptable to various assumptions. For example, it doesn't assume that the latest datetime value corresponds to the largest stop_seq value for a given shipment.

with test_ranked(shipment_id,stop_seq,time,rankup,rankdown) as (
  select
    shipment_id, stop_seq, time,
    row_number() over (
      partition by shipment_id
      order by stop_seq
    ),
    row_number() over (
      partition by shipment_id
      order by stop_seq desc
    )
  from test
), test_extreme_times(shipment_id,tag,time) as (
  select
    shipment_id, 'start', time
  from test_ranked where rankup = 1
  union all
  select
    shipment_id, 'end', time
  from test_ranked where rankdown = 1
)
  select
    shipment_id, [start], [end]
  from test_extreme_times
  pivot (max(time) for tag in ([start],[end])) P
  order by shipment_id;
  go

The PIVOT isn't really needed, but it's handy. However, do note that the MAX inside the PIVOT expression doesn't do anything useful. There's only one [time] value for each tag, so MIN would work just as well. The syntax requires an aggregate function in this position.

Addendum: Here's an adaptation of CptSkippy's solution that may be more efficient than using MIN and MAX if you have a shipments table:

SELECT shipment_id
    , (SELECT TOP 1 time 
        FROM test AS [b] 
        WHERE b.shipment_id = a.shipment_id 
        ORDER BY stop_seq ASC) AS [start]
    , (SELECT TOP 1 time 
        FROM test AS [b] 
        WHERE b.shipment_id = a.shipment_id 
        ORDER BY stop_seq DESC) AS [end]
FROM shipments_table AS [a];
Steve Kass
  • 7,144
  • 20
  • 26
  • thanks steve. this looks a little more complicated than I am looking for given the specific need and the fact that I already have a bunch of CTEs in my stored procedure and don't want to clutter it up entirely :-) I do like the row_number() with partition...i need to research that some more to understand its capabilities better. thanks for you help and suggestions! – thomas Aug 11 '09 at 02:10