0

Say you have a simple table that represents a time series for another entity identified by someID. Each row is identified by someID and a timestamp, which is not subject to any regular increment, i.e. intervals can vary:

CREATE TABLE someSeries
(
   someID int not null,
   rowTS datetime not null,
   val int not null
);
ALTER TABLE someSeries 
ADD CONSTRAINT PK_someSeries(someID, rowTS);

Is there an elegant and efficient way (so not using the cartesian product) to return all rows and display the row's rowTS and the most recent previous rowTS for that someID?

E.g. if the data is

someID        rowTS            val
------------------------------------
1             9/1/2012         2
1             9/2/2012         3
1             9/5/2012         5
2             9/2/2012         1
2             9/4/2012         6
3             9/5/2012         7
3             9/7/2012         9
3             9/10/2012        2

That query should return

someID        rowTS            prevRowTS          val          prevVal
------------------------------------------------------------------------
1             9/1/2012         null               2            null
1             9/2/2012         9/1/2012           3            2 
1             9/5/2012         9/2/2012           5            3
2             9/2/2012         null               1            null
2             9/4/2012         9/2/2012           6            1
3             9/5/2012         null               7            null
3             9/7/2012         9/5/2012           9            7
3             9/10/2012        9/7/2012           2            9

Currently, I need something like this in my app and the way I do it is in the application tier, basically I store the last rowTS in the someID master table where it is the PK and then, upon the time series insertion, I get that value from the master table and look up the most recent previous record, and do some computation (e.g. compare val and prevVal) and insert it in the time series table.

But I was wondering if there is a fast way to do it in just SQL. The only thing that comes to mind is the cartesian product and, needless to say, that is not very efficient.

amphibient
  • 29,770
  • 54
  • 146
  • 240

3 Answers3

2

For SQL Server, Oracle, and PostgreSQL - using Windowing functions

;with cte as (
select *, rn=row_number() over (partition by someid order by rowTS)
from someSeries
)
select a.someID, a.rowTS, b.rowTS prevRowTS, a.val, b.val prevVal
from cte a
left join cte b on a.someid = b.someID and b.rn = a.rn-1
order by a.someID, a.rowts

For SQL Server 2012 and Oracle, using LAG function which easily outperforms the above.

select
    someid,
    rowts,
    lag(rowts) over (partition by someid order by rowts) prevrowts,
    val,
    lag(val) over (partition by someid order by rowts) prevval
from someSeries
order by someid, rowts

For MySQL ONLY, hack but it performs really well.

select
  @ts:=rowts rowts,
  if(@s=someID,@ts,null) prevrowts,
  @v:=val val,
  if(@s=someID,@v,null) prevval,
  @s:=someID someID
from (select @s:=null) a, someSeries
order by someID, rowts

Note: while you may be tempted, DON'T move the someID column before the others.

RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
1

Since you said that it doesn't matter what RDBMS you are using, here is how you do that in SQL Server:

;WITH cte
AS
(
    SELECT *, ROW_NUMBER() OVER(Partition BY someID ORDER BY someID, rowTS) row_num
    FROM @Temp
)
SELECT c1.someID, c1.rowTS, 
  (SELECT MAX(c2.rowTS) 
   FROM cte c2 
   WHERE c2.someID = c1.someID AND c2.row_num < c1.row_num) AS prevRowTS,
  c1.val,
  (SELECT MAX(c2.val) 
   FROM cte c2 
   WHERE c2.someID = c1.someID AND c2.row_num < c1.row_num) AS prevVal
FROM cte c1

Here is a live demo

Community
  • 1
  • 1
Mahmoud Gamal
  • 78,257
  • 17
  • 139
  • 164
0

This is very similar to this question: SQL subtract two rows based on date and another column

There are a number of solutions there.

Community
  • 1
  • 1
Mike Parkhill
  • 5,511
  • 1
  • 28
  • 38