3

I have the following three simple T-SQL queries. First one is to get records within a range of boundaries (DATETIME type):

 SELECT value, timestamp 
 FROM myTable
 WHERE timestamp BETWEEN @startDT AND @endDT

the second one is to get the closest record to @startDT (DATETIME type)

 SELECT TOP 1
   value, timestamp
 FROM myTable
 WHERE timestamp > @startDT
 ORDER BY timestamp DESC

and the last one is to get the closest record after @endDT:

 SELECT TOP 1
   value, timestamp
 FROM myTable
 WHERE timestamp < @endDT
 ORDER BY timestamp ASC

I would like to get all the records of above three queries as one group of records. I tried to use UNION, but it seems that sub-queries within UNION does not allow ORDER BY clause. Is there efficient way to get my result?

 . .  * | * * * * * | * . . .
      start        end

The above graph simply shows the records of *s as my required records, and |...| is the boundaries.

By the way, the amount of data in myTable is huge. My understanding UNION is not an efficient way to get data from UNIONs. Any efficient way to get data without UNION?

David.Chu.ca
  • 37,408
  • 63
  • 148
  • 190
  • Are you sure you're getting the desired results from these queries with `ORDER BY value, timestamp`? Based on your description, I would think you'd want to order on only timestamp. Otherwise, you're getting the smallest value closest to the boundary. – Joe Stefanelli Jun 10 '11 at 20:14
  • You are right, I changed it by timestamp only. – David.Chu.ca Jun 10 '11 at 20:16
  • It is nice to see so many answers to meet my need. @Piotr Auguscik mentioned that min/max is better than Order + top 1 in terms of performance. Is that right? Based on answers, I figured out a "min/max + Group By" solution. Not sure which one is better. Keep in mind that myTable contains huge amount of records in my case. – David.Chu.ca Jun 11 '11 at 00:16

5 Answers5

3

As you wish, without UNION.

MySQL (TESTED)

SELECT 
     dv1.timestamp, dv1.values
FROM 
     myTable AS dv1
WHERE 
    dv1.timestamp 
    BETWEEN (
           SELECT dv2.timestamp 
           FROM myTable AS dv2
           WHERE dv2.timestamp < '@START_DATE' 
           ORDER BY dv2.timestamp DESC 
           LIMIT 1
           )
    AND ( SELECT dv3.timestamp 
          FROM myTable AS dv3
          WHERE dv3.timestamp > '@END_DATE' 
          ORDER BY dv3.timestamp ASC 
          LIMIT 1
        )

EDIT Sorry, I forgot to notice about T-SQL.

T-SQL (NOT TESTED)

SELECT 
     dv1.timestamp, dv1.values
FROM 
     myTable AS dv1
WHERE 
    dv1.timestamp 
    BETWEEN (
           SELECT TOP 1 dv2.timestamp 
           FROM myTable AS dv2
           WHERE dv2.timestamp >  @START_DATE 
           ORDER BY dv2.timestamp DESC
           )
    AND ( SELECT TOP 1 dv3.timestamp 
          FROM myTable AS dv3
          WHERE dv3.timestamp <  @END_DATE 
          ORDER BY dv3.timestamp ASC
        )

Note If the result is not right, you could just exchange the sub queries (i.e. operators, and ASC/DESC).

Think out of the box :)

rhzs
  • 516
  • 7
  • 24
  • Nice. I am not 100% sure if UNION is less efficient. Is that true? – David.Chu.ca Jun 10 '11 at 21:04
  • 1
    By the way, LIMIT 1 is not supported in T-SQL. I think it should be TOP 1 instead? – David.Chu.ca Jun 10 '11 at 21:18
  • Some say yes. In advance, you could compare the query execution time by using `EXPLAIN the_query` :) However, UNION seems to be useful in some cases, but I prefer to avoid it. – rhzs Jun 10 '11 at 21:23
  • Yes, I am sorry. I tested in MySQL. But, It should also be worked in T-SQL. – rhzs Jun 10 '11 at 21:25
  • you do realize of course this makes no sense. as written your where clause is equivalent to `WHERE dv1.timestamp btween @START_DATE and @END_DATE` – Conrad Frix Jun 10 '11 at 21:58
  • +1 because its close to what the OP wants. Also you'd need to remove the single quotes around the parameters. To see a working example go [here](http://data.stackexchange.com/gis/s/1465/sample-for-sql-query-records-within-a-range-of-boundaries-and-max-min-outside-the) You can write your own if you want – Conrad Frix Jun 10 '11 at 22:26
1

U can use max/min to get value u need. Order by +top 1 isnt best way to get max value, what i can see in ur querys. To sort n items its O(n to power 2), getting max should be only O(n)

Piotr Auguscik
  • 3,651
  • 1
  • 22
  • 30
  • Then I have to use GROUP BY if I use max/min. My queries are simplified ones. Actually there are more than 2 fields in a records such as value1, value2, ..., timestamp. – David.Chu.ca Jun 10 '11 at 20:29
1
SELECT value, timestamp 
 FROM myTable
 WHERE timestamp BETWEEN @startDT AND @endDT
union
select A.Value, A.TimeStamp
From (
 SELECT TOP 1
   value, timestamp
 FROM myTable
 WHERE timestamp > @startDT
 ORDER BY value, timestamp DESC ) A
Union
Select A.Value, A.TimeStamp
From (
 SELECT TOP 1
   value, timestamp
 FROM myTable
 WHERE timestamp < @endDT
 ORDER BY value, timestamp ASC ) A
Jeff
  • 5,913
  • 2
  • 28
  • 30
1

The second and third queries in your post don't make much sense because

WHERE timestamp > @startDT

and

WHERE timestamp < @endDT

result in timestamps INSIDE the range, but your descriptions

 . .  * | * * * * * | * . . .
      start        end

The above graph simply shows the records of *s as my required records, and |...| is the boundaries.

means something different.

So following the descriptions and using the following mapping

myTable = Posts
value = score
timestamp  = creationdate 

I wrote this query on data.stackexchange.com (modified from exodream's answer but with the comparison operators in the correct reverse direction)

DECLARE @START_DATE datetime
DECLARE @END_DATE datetime
SET @START_DATE = '2010-10-20'
SET @END_DATE = '2010-11-01'

SELECT score, 
       creationdate 
FROM   posts 
WHERE  creationdate BETWEEN (SELECT TOP 1 creationdate 
                             FROM   posts 
                             WHERE  creationdate < @START_DATE 
                             ORDER  BY creationdate DESC) 
                             AND 
                                   (SELECT TOP 1 creationdate 
                                    FROM   posts 
                                    WHERE  creationdate > @END_DATE 
                                    ORDER  BY creationdate ASC) 
ORDER by creationDate

Which outputs

score creationdate        
----- ------------------- 
4     2010-10-19 23:55:48 
3     2010-10-20 2:24:50   
6     2010-10-20 2:55:54  
...
...
7     2010-10-31 23:14:48 
4     2010-10-31 23:18:17 
4     2010-10-31 23:18:48 
0     2010-11-01 3:59:38  

(382 row(s) affected)

Note how the first row and last rows are just outside the limits of the range

Community
  • 1
  • 1
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
0

You can put those ordered queries into subqueries to get around not being able to UNION them directly. A little annoying, but it'll get you what you want.

SELECT value, timestamp 
 FROM myTable
 WHERE timestamp BETWEEN @startDT AND @endDT
 UNION
 SELECT value, timestamp 
 FROM (
 SELECT TOP 1
   value, timestamp
 FROM myTable
 WHERE timestamp > @startDT
 ORDER BY value, timestamp DESC 
 ) x
 UNION
 SELECT value, timestamp
 FROM (
 SELECT TOP 1
   value, timestamp
 FROM myTable
 WHERE timestamp < @endDT
 ORDER BY value, timestamp ASC
 ) x
dpmattingly
  • 1,301
  • 1
  • 7
  • 11