Tricky, challanging, but feasible :)
First of all, I tried to set up your scenario as slim as possible with the following statements:
create table issues (
issueon date not null primary key
);
insert into issues values('2015-01-10'), ('2015-01-15'), ('2014-10-25');
insert into issues values('2015-01-08'), ('2015-01-20'), ('2014-1-10');
(Without loss of generality, I am only doing this for a single magazine).
Here's the complete SELECT
statement with which I came up after some testing with assuming that I am trying to detect the current, the previous and the next issue date around '2015-01-11'
:
select tableNext.theIssue, theNext, thePrevious
from
(
select basis.issueon as theIssue, target.issueon as thePrevious, datediff(basis.issueon, target.issueon) as surroundingdistance
from issues as basis, issues as target
where basis.issueon = (
select closestissue from (
select issueon as closestissue, abs(datediff('2015-01-11', issueon)) as distance from issues order by distance asc limit 1
) as theissue
) and datediff(basis.issueon, target.issueon) > 0
order by surroundingdistance limit 1
) as tableNext
left join
(
select basis.issueon as theIssue, target.issueon as theNext, datediff(basis.issueon, target.issueon) as surroundingdistance
from issues as basis, issues as target
where basis.issueon = (
select closestissue from (
select issueon as closestissue, abs(datediff('2015-01-11', issueon)) as distance from issues order by distance asc limit 1
) as theissue
) and datediff(basis.issueon, target.issueon) < 0
order by surroundingdistance desc limit 1
) as tablePrevious
on tableNext.theIssue = tableNext.theIssue
;
Don't panic - it's a monster and I'll try to explain it step by step:
Innerst Statement -- First Layer
Let's first have a look at the inner-most select statement:
select issueon as closestissue, abs(datediff('2015-01-11', issueon)) as distance
from issues order by distance asc limit 1
The statement computes the "distance" between the dates of all issues and the date where we started. To ensure that we are looking both into the forward and into the backward direction, the absolute value of the distance is taken. Using order by distance limit 1
we are determining the "nearest match" (NB: This could also be done using GROUP BY
and HAVING
clauses, but this is much less liked by MySQL than LIMIT 1
).
The value that is "the best match" then can be found in attribute closestissue
.
Second layer
Now let's have a look at
select closestissue from (
select issueon as closestissue, abs(datediff('2015-01-11', issueon)) as distance from issues order by distance asc limit 1
) as theissue
With this one, we are just stripping away the attribute distance
as it will be hindering us in a second
Third layer
On the third layer, we are taking the result of the second layer and use that one as input on a WHERE
condition to select the "right" issue from the issues
table again:
select basis.issueon as theIssue, target.issueon as theNext, datediff(basis.issueon, target.issueon) as surroundingdistance
from issues as basis, issues as target
where basis.issueon = (
select closestissue from (
select issueon as closestissue, abs(datediff('2015-01-11', issueon)) as distance from issues order by distance asc limit 1
) as theissue
)
Afterwards, we are joining the same table again (self-join) in a cartesian product. Note, that the left table of the join only may contain a single record (assuming that there is only one issue per day, which I hope is safe). Therefore, this cartesian product is not risky, as it will only return as many records as we anyhow have in the table issues
(so we stay in complexity O(n)
). The free hand-side of this right (the "right side") is called target
in the query.
We are now able to compare the "distance" from the "proper issue date" with all dates in the table again. We do so by calculating datediff(basis.issueon, target.issueon) as surroundingdistance
again.
Previous and Next
Once we have the sourroundingdistance
it's getting easy to apply the same pattern as on the first layer. However, this time we may not just take the absolute distance, but we have to look at two distinct sets:
- Those which were earlier (
sourroundingdistance < 0
), and
- those which were later (
sourroundingdistance > 0
)
Doing the same trick now again with ORDER BY
+ LIMIT 1
, we can isolate a single issue. However, as we have to consider those two sets seperately, we have to execute the same query (with some minor modifications) twice -- that's why that beast gets so long-ish.
Putting it all together
We now have two queries, which
- contain the corresponding issue date, and
- indicate the previous respectively the next issue date.
The common attribute for these two queries is the corresponding issue date. Therefore, we can join these two queries with each other via this common attribute.
And voilà: The "monster" query provides the following desired result:
theIssue theNext thePrevious
2015-01-10 2015-01-15 2015-01-08
Amendment
Though I had a brief look at the execution plan (EXPLAIN
) of the query and was astonished that the complexity still was quite limited, from a maintenance perspective such a query is a nightmare.
If you are able and you may afford to execute multiple statements sequentially, there might be better approaches than firing such a huge query to the server. For example, you could fire a first query containing the content of the first layer, getting the result and using that one as input for a second query, with which you are determining the "next issue". A third query could retrieve the "previous issue".
Also, if your environment permits it, you may consider writing a function (CREATE FUNCTION
) for it, which encapsulates all this in a more sequential way.
All these are just variants of the same approach. In the end, I hope I could equip you with the necessary "weapons" to tackle this issue using SQL.