3

I have a table that contains index data, with each row having a start and end time. It's trivial to query this table and get a list of the index rows whose timespans fall within an overarching time period (usually one day from 00:00:00 to 23:59:59).

declare @date datetime, @start datetime, @end datetime

set @date = GetDate() //in production code this is a n input to a stored proc

//the date component of the current date/time, starting at midnight
set @start = DateAdd(dd, DateDiff(dd, 0, @date), 0) 

//one second before midnight the next day
set @end = DateAdd(dd, 1, DateAdd(ss, -1, @start)) 

select idx, start_time, end_time
from index_data
where start_time <= @end
and end_time >= @start
order by start_time 

The results would be something like this:

idx         start_time                  end_time
---------------------------------------------------------------
495640      2012-05-03 00:17:13.000     2012-05-03 00:17:45.000
495641      2012-05-03 00:18:20.000     2012-05-03 00:18:51.000
495642      2012-05-03 00:18:55.000     2012-05-03 00:19:31.000
495643      2012-05-03 00:34:08.000     2012-05-03 00:34:28.000
495644      2012-05-03 00:36:21.000     2012-05-03 00:36:41.000
495646      2012-05-03 01:22:21.000     2012-05-03 01:22:38.000
495647      2012-05-03 01:24:38.000     2012-05-03 01:24:55.000
495648      2012-05-03 01:30:11.000     2012-05-03 01:30:29.000
495649      2012-05-03 01:31:23.000     2012-05-03 01:31:39.000
495650      2012-05-03 02:09:57.000     2012-05-03 02:10:59.000
495651      2012-05-03 02:11:00.000     2012-05-03 02:11:00.000
495652      2012-05-03 02:14:25.000     2012-05-03 02:14:42.000
495653      2012-05-03 02:31:09.000     2012-05-03 02:31:25.000
495655      2012-05-03 03:02:32.000     2012-05-03 03:02:51.000
...

What I need is an efficient query (no cursors or other loops) that will produce a result set giving me a row for each hour and minute of the given day, for which at least one second of that minute falls within at least one index's time range:

hour   min
----------
0      17
0      18
0      19
0      34
0      36
1      22
1      24
1      30
1      31
2      9
2      10
2      11
2      14
2      31
3      02
...

Each index's start_time and end_time are usually less than 60 secs apart (thus producing several rows per minute, hence the desire to consolidate to save network bandwidth), but I can't guarantee that; some rows could span a longer period, for instance a start_time of 02:20:34 and an end_time of 02:23:43. Given a span like this, the result set must include 2:20, 2:21, 2:22 and 2:23, which you won't get by UNIONing a query for the start_time with a query for the end_time.

Here's the kicker: the query must be compatible with the MSDE engine, which is basically MSS 2000. So, no CTEs (or I would have this done already).

KeithS
  • 70,210
  • 21
  • 112
  • 164

1 Answers1

2

If by efficient you mean minimum CPU time, Reads, etc, then you need a look-up table.

Create a table that contains all 1440 minutes of the day as DATETIMEs.

DECLARE
  @date  DATETIME,
  @start DATETIME,
  @end   DATETIME
SELECT
  @date  = GetDate(),
  @start = DateAdd(dd, DateDiff(dd, 0, @date), 0),
  @end   = DateAdd(dd, 1, @start)

SELECT
  minute_lookup.timestamp,
  COUNT(*)                    AS total_records
FROM
  index_data
INNER JOIN
  minute_lookup
    ON  minute_lookup.timestamp >= index_data.start_time - @date
    AND minute_lookup.timestamp <  index_data.end_time   - @date
WHERE
      index_data.start_time < @end
  AND index_data.end_time   > @start
GROUP BY
  minute_lookup.timestamp
ORDER BY
  minute_lookup.timestamp

Effectively you're caching some of the calculations. No trying to fill gaps between data-points.


An Aside on DateTime Boundaries

Also, note that I use the start and end values differently.

If I want to cover the whole of a day, I don't say 2012-01-01 00:00:00 to 2012-01-01 23:59:59. What if the table includes a value half a second after that?

DateTimes are not really discrete values, they're continuous. As such I use FROM 2012-01-01 00:00:00 upto, but excluding, 2012-01-02 00:00:00

  • x >= '2012-01-01' AND x < '2012-01-02'

In this way, no matter what level of accuracy x is stored as, if it represents anything in 1st Jan 2012, I've captured it.

MatBailie
  • 83,401
  • 18
  • 103
  • 137
  • Beat me to it by mere seconds. +1 It's almost identical to using a numbers table as described here: http://stackoverflow.com/questions/10819/sql-auxiliary-table-of-numbers – RThomas May 03 '12 at 23:41