Your problem is that you have essentially two queries here, with different ordering. If you're not able to actually run them as two queries (that is, in your application layer), you need to get fancy.
However:
- The problem with
ORDER BY CASE...
solutions is that they are (almost) guaranteed to never use an index - the optimizer can't page up/down through the index because the order is dependent on the row value.
- The same limitation as above applies to doing various types of "math" on the desired columns. Plus the additional overhead of whatever math you're doing.
If your result set is small, the above is a non-issue. On dbs of any real size, though, this is important. So...
When you have two (semi-related) queries to combine, you use UNION ALL
(or UNION
if there are duplicates to remove). The problem is that the resulting sets are stitched together in an "undefined" order (this is not always explicitly spelled out in documentation). Reasons for this vary, but will include if the system is using data parallelism, which side may currently be waiting for a lock, or just because it's Tuesday in Sunnydale. So we'll need an ORDER BY
...
Except the result in your query essentially requires two (different) ORDER BY
s. And we can only supply one for use with a UNION (ALL)
operator (because the sort will be performed on the combined set). Fortunately, different ordering can be given for each column for the clause! And UNION
would allow us to supply a "dummy" null column - so where does that sort results?
SELECT id, COALESCE(date, other) AS sortedDate
FROM (SELECT id, date, null AS other
FROM <sourceTable>
WHERE date >= CURRENT_TIMESTAMP
UNION ALL
SELECT id, null, date
FROM <sourceTable>
WHERE date < CURRENT_TIMESTAMP) Derived
ORDER BY date ASC, other DESC
PostgreSQL Fiddle Demo
Why does this work? In RDBMSs, null
s sort to one end of the range. For PostgreSQL, Oracle, and DB2, they are sorted as the "greatest" value. That is, an intermediate table that looks like this is generated:
id | date | other
7 | July 15 2014 12:00 AM | null
8 | July 15 2014 1:00 PM | null
9 | July 15 2014 2:00 PM | null
10 | July 15 2014 3:00 PM | null
6 | null | June 1 2014 12:00 AM
5 | null | May 1 2014 12:00 AM
4 | null | April 1 2014 12:00 AM
3 | null | March 1 2014 12:00 AM
2 | null | February 1 2014 12:00 AM
1 | null | January 1 2014 12:00 AM
... you can see how this compares with the ordering expressed in the ORDER BY
, right?
Unfortunately, for MySQL and SQL Server, null
is the lowest value, so we have to change our statement!
SELECT id, COALESCE(date, other) AS sortedDate
FROM (SELECT id, 1 AS orderGroup, date, null AS other
FROM <sourceTable>
WHERE date >= CURRENT_DATE
UNION ALL
SELECT id, 2, null, date
FROM <sourceTable>
WHERE date < CURRENT_DATE) Derived
ORDER BY orderGroup ASC, date ASC, other DESC
SQL Server Fiddle Demo
... here, the intermediate table looks like this:
id | og | date | other
7 | 1 | July 15 2014 12:00 AM | null
8 | 1 | July 15 2014 1:00 PM | null
9 | 1 | July 15 2014 2:00 PM | null
10 | 1 | July 15 2014 3:00 PM | null
6 | 2 | null | June 1 2014 12:00 AM
5 | 2 | null | May 1 2014 12:00 AM
4 | 2 | null | April 1 2014 12:00 AM
3 | 2 | null | March 1 2014 12:00 AM
2 | 2 | null | February 1 2014 12:00 AM
1 | 2 | null | January 1 2014 12:00 AM
(incidently, this version will work even when null
sorts as "greatest")
So what about the ability for the db to use indices?
I don't know. (I don't have a good box/sample set to test on anyways...)
It's going to depend on how smart the optimizer is, and if it can extract the constants from the SELECT
lists. If so, I expect that it's going to scan the relevant index twice - once ascending and once descending, from the start of the range.