0

So, I have this SQL table:

Traveller      checkin       dateTime
5566           Madrid        2017-01-01 01:00:00.00
5566           Barcelona     2017-01-02 03:00:00.00
5566           Berlin        2017-01-03 02:00:00.00
5566           Paris         2017-01-06 05:00:00.00
5566           London        2017-01-07 06:00:00.00
5566           Madrid        2017-01-08 02:00:00.00
4422           Moscow        2017-01-03 08:00:00.00
4422           Madrid        2017-01-04 07:00:00.00
4422           Barcelona     2017-01-05 03:00:00.00
8833           Barcelona     2017-02-01 08:00:00.00
8833           Berlin        2017-02-02 04:00:00.00
8833           London        2017-02-03 01:00:00.00
8833           Berlin        2017-02-03 22:00:00.00
9966           Paris         2017-02-03 04:00:00.00
9966           London        2017-02-04 06:00:00.00
9966           Berlin        2017-02-05 01:00:00.00
...            ...           ...

Is it possible to somehow order these into a from-to table, using all but the first checkin as both a from and to destination. Like this:

Traveller      From       To
5566           Madrid     Barcelona
5566           Barcelona  Berlin
5566           Berlin     Paris
5566           Paris      London
5566           London     Madrid
4422           Moscow     Madrid
4422           Madrid     Barcelona
8833           Barcelona  Berlin
8833           Berlin     London
8833           London     Berlin
...            ...        ...

I know the basics of SQL but I'm still learning, so if anyone can think of a way this could be done (or not), help me understand how it works.

Thanks a lot!

Katalo
  • 43
  • 6

3 Answers3

2

You can use ROW_NUMBER() to assign sequential numbers to each traveller's checkins, in a CTE, and then join the CTE to itself to create the result:

declare @t table (Traveller int not null, checkin varchar(19) not null,
                  TimeStamp datetime2 not null)
insert into @t (Traveller,checkin,TimeStamp) values
(5566,'Madrid',   '2017-06-02T07:56:01'),
(5566,'Barcelona','2017-06-02T07:56:02'),
(5566,'Berlin',   '2017-06-02T07:56:03'),
(5566,'Paris',    '2017-06-02T07:56:04'),
(5566,'London',   '2017-06-02T07:56:05'),
(5566,'Madrid',   '2017-06-02T07:56:06'),
(4422,'Moscow',   '2017-06-02T07:56:07'),
(4422,'Madrid',   '2017-06-02T07:56:08'),
(4422,'Barcelona','2017-06-02T07:56:09'),
(8833,'Barcelona','2017-06-02T07:56:10'),
(8833,'Berlin',   '2017-06-02T07:56:11'),
(8833,'London',   '2017-06-02T07:56:12'),
(8833,'Berlin',   '2017-06-02T07:56:13'),
(9966,'Paris',    '2017-06-02T07:56:14'),
(9966,'London',   '2017-06-02T07:56:15'),
(9966,'Berlin',   '2017-06-02T07:56:16')

;With Numbered as (
    select
        *,
        ROW_NUMBER() OVER (PARTITION BY Traveller ORDER by TimeStamp) as rn
    from @t
)
select
    n1.Traveller,n1.checkin,n2.checkin
from
    Numbered n1
        inner join
    Numbered n2
        on
            n1.Traveller = n2.Traveller and
            n1.rn = n2.rn - 1
order by
    n1.Traveller,n1.rn

Results:

Traveller   checkin             checkin
----------- ------------------- -------------------
4422        Moscow              Madrid
4422        Madrid              Barcelona
5566        Madrid              Barcelona
5566        Barcelona           Berlin
5566        Berlin              Paris
5566        Paris               London
5566        London              Madrid
8833        Barcelona           Berlin
8833        Berlin              London
8833        London              Berlin
9966        Paris               London
9966        London              Berlin
Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
0

You can use row_number window function. It's Sql Server 2008 way to handle this problem. Here it is:

;with raw_data (id, city, arrival) as (
    select 5566, 'Madrid', '2017-01-01 01:00:00.00' union all
    select 5566, 'Barcelona', '2017-01-02 03:00:00.00' union all
    select 5566, 'Berlin', '2017-01-03 02:00:00.00' union all
    select 5566, 'Paris', '2017-01-06 05:00:00.00' union all
    select 5566, 'London', '2017-01-07 06:00:00.00' union all
    select 5566, 'Madrid', '2017-01-08 02:00:00.00' union all
    select 4422, 'Moscow', '2017-01-03 08:00:00.00' union all
    select 4422, 'Madrid', '2017-01-04 07:00:00.00' union all
    select 4422, 'Barcelona', '2017-01-05 03:00:00.00' union all
    select 8833, 'Barcelona', '2017-02-01 08:00:00.00' union all
    select 8833, 'Berlin', '2017-02-02 04:00:00.00' union all
    select 8833, 'London', '2017-02-03 01:00:00.00' union all
    select 8833, 'Berlin', '2017-02-03 22:00:00.00' union all
    select 9966, 'Paris', '2017-02-03 04:00:00.00' union all
    select 9966, 'London', '2017-02-04 06:00:00.00' union all
    select 9966, 'Berlin', '2017-02-05 01:00:00.00'
)
, arrivals as (
    select
        id, city, arrival,
        row_number() over (partition by id order by arrival) as rn
    from raw_data
)
, flies as (
    select
        fr.id,
        fr.city as [from],
        fr.arrival as [departure],
        [to].city as [to],
        [to].arrival as [arrival]
    from arrivals fr
    join arrivals [to] on
        fr.id = [to].id
        and [to].rn = fr.rn + 1
)
select
    *
from flies

SQL Server 2012 and higher

But if you have Sql Server 2012 and higher you can use solve it with lag function, which gives access to some previous row in current.

E.g. lag(colX, N, <default>) means you got N-th previous value of column colX or value if there isn't. It's just we need! Strictly the previous one:

;with raw_data (id, city, arrival) as (
    -- omitted for the sake of shortness :)
)
, flies as (
    select
        id,
        -- here it is!!! prev city and arrival
        lag(city, 1, null) over (partition by id order by arrival) as [from],
        lag(arrival, 1, null) over (partition by id order by arrival) as [departure],
        city as [to],
        arrival as [arrival]
    from raw_data
)
select
    *
from flies
where
    -- and here we take only rows from where we're 'departured'
    [from] is not null
pkuderov
  • 3,501
  • 2
  • 28
  • 46
  • What do i put in -- omitted for the sake of shortness – Katalo Jun 02 '17 at 07:36
  • @Katalo the same part from first query - filling example data. It just so many duplicate lines so I omitted them. And sorry for renaming you original columns - just missed that in a hurry( – pkuderov Jun 02 '17 at 07:38
0

You can use the LEAD function to return the next row's value, eg

select 
    Traveller,
    checkin,
    LEAD(checkin,1) OVER(PARTITION by traveller order by timestamp) as destination
from @t t

This will return a NULL destination for the last leg. To remove this, you can use a CTE to filter the data:

with checkins as (
    select 
        Traveller,
        checkin,LEAD(checkin,1) OVER(PARTITION by traveller order by timestamp) as destination
    from @t t)
select * from checkins 
where destination is not null

To test all of this :

declare @t table (Traveller int not null, checkin varchar(19) not null, TimeStamp datetime2 not null);

insert into @t (Traveller,checkin,TimeStamp) values
(5566,'Madrid',   '2017-06-02T07:56:01'),
(5566,'Barcelona','2017-06-02T07:56:02'),
(5566,'Berlin',   '2017-06-02T07:56:03'),
(5566,'Paris',    '2017-06-02T07:56:04'),
(5566,'London',   '2017-06-02T07:56:05'),
(5566,'Madrid',   '2017-06-02T07:56:06'),
(4422,'Moscow',   '2017-06-02T07:56:07'),
(4422,'Madrid',   '2017-06-02T07:56:08'),
(4422,'Barcelona','2017-06-02T07:56:09'),
(8833,'Barcelona','2017-06-02T07:56:10'),
(8833,'Berlin',   '2017-06-02T07:56:11'),
(8833,'London',   '2017-06-02T07:56:12'),
(8833,'Berlin',   '2017-06-02T07:56:13'),
(9966,'Paris',    '2017-06-02T07:56:14'),
(9966,'London',   '2017-06-02T07:56:15'),
(9966,'Berlin',   '2017-06-02T07:56:16');



with checkins as (
    select 
        Traveller,
        checkin,LEAD(checkin,1) OVER(PARTITION by traveller order by timestamp) as destination
    from @t t)
select * from checkins 
where destination is not null

Will return :

Traveller   checkin             destination
----------- ------------------- -------------------
4422        Moscow              Madrid
4422        Madrid              Barcelona
5566        Madrid              Barcelona
5566        Barcelona           Berlin
5566        Berlin              Paris
5566        Paris               London
5566        London              Madrid
8833        Barcelona           Berlin
8833        Berlin              London
8833        London              Berlin
9966        Paris               London
9966        London              Berlin
Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • Whilst we may encourage people to move to supported versions, the sad reality is that people do continue to work with older versions. As such, since the OP has tagged specifically as 2008, this won't solve their problem. – Damien_The_Unbeliever Jun 02 '17 at 07:26
  • @Damien_The_Unbeliever the Technet docs don't show results for unsupported versions unless you explicitly type the version number in the URL. Finding what works in these versions is not that easy – Panagiotis Kanavos Jun 02 '17 at 07:27
  • Hey, sorry for the inconvenience, but its a SQL server 2012 im running – Katalo Jun 02 '17 at 07:34
  • @Katalo if you check the different SQL queries you'll see why this is a significant difference. Affects performance too – Panagiotis Kanavos Jun 02 '17 at 07:36
  • This does what I asked for! Thanks! – Katalo Jun 02 '17 at 08:04