Generating missing data is an annoying problem. Usually it's best left to the application level.
Failing that, and if you have to do it in MySQL, the missing data has to come from somewhere. This can be either a dynamic sequence of numbers built from user variables and a union cross join - or if we know our domain values (like we do in this case) we can pre-build some tables to take care of it.
In this case you're operating on dates, so we need a few tables that will help us generate all the values you're interested in. I'd suggest these tables:
calendar_years(y integer)
calendar_months(m integer)
calendar_days(d integer)
calendar_hours(h integer)
calendar_minutes(m integer)
These tables would all be a single column, containing the range of values you're looking at. calendar_minutes
can just have 0
and 30
as values because we're just looking at half hour increments.
This is an instance where a cross join
is actually what we want.
select unix_timestamp(concat(y, '-', lpad(m, 2, '0'), '-', lpad(d, 2, '0'), ' ', lpad(h, 2, '0'), ':', lpad(mm, 2, '0'), ':', '00')) tt
from calendar_years
cross join calendar_months
cross join calendar_days
cross join calendar_hours
cross join calendar_minutes
Will now give us every possible combination of our table values, or in essence - every period we're interested in looking at.
We build those columns into a timestamp by using concat
and lpad
to turn them into a valid datetime
string, and then convert it with unix_timestamp
.
This gives us a set of values to left join against readed_messages
, which will give us your final answer:
select from_unixtime(tt), count(id)
from (
select unix_timestamp(concat(y, '-', lpad(m, 2, '0'), '-', lpad(d, 2, '0'), ' ', lpad(h, 2, '0'), ':', lpad(mm, 2, '0'), ':', '00')) tt
from calendar_years
cross join calendar_months
cross join calendar_days
cross join calendar_hours
cross join calendar_minutes
) q
left join readed_messages rm
on (1800 * floor(`date` / 1800)) = q.tt
where tt <> 0 -- this method is brute force, generates some invalid dates
group by tt
Obviously this will give us a lot of values, probably for a period far greater than you're interested in looking at, you can further filter it in the where
clause if you wish.
heres a demo