6

I need some help with a mysql query. I've got db table that has data from Jan 1, 2011 thru April 30, 2011. There should be a record for each date. I need to find out whether any date is missing from the table.

So for example, let's say that Feb 2, 2011 has no data. How do I find that date?

I've got the dates stored in a column called reportdatetime. The dates are stored in the format: 2011-05-10 0:00:00, which is May 5, 2011 12:00:00 am.

Any suggestions?

pilcrow
  • 56,591
  • 13
  • 94
  • 135
Laxmidi
  • 2,650
  • 12
  • 49
  • 81
  • 1
    So you're saying that a single date wouldn't have a row at all or that it has a row and there's no data in another column? If it doesn't have a row the only way I can think of would be to loop through each date and check it. – Cfreak Jul 01 '11 at 17:23
  • You mean reportdatetime has the entry of Feb 2, 2011 but other fields associated to that date are not present? – Rahul Jul 01 '11 at 17:24
  • Possibly duplicate of http://stackoverflow.com/questions/3538858/mysql-how-to-fill-missing-dates-in-range – giorgio79 Aug 22 '12 at 20:25
  • And another dupe http://stackoverflow.com/questions/5522701/how-to-find-missing-rows-dates-in-a-mysql-table – giorgio79 Aug 30 '12 at 13:10

6 Answers6

12

This is a second answer, I'll post it separately.

SELECT DATE(r1.reportdate) + INTERVAL 1 DAY AS missing_date
FROM Reports r1
LEFT OUTER JOIN Reports r2 ON DATE(r1.reportdate) = DATE(r2.reportdate) - INTERVAL 1 DAY
WHERE r1.reportdate BETWEEN '2011-01-01' AND '2011-04-30' AND r2.reportdate IS NULL;

This is a self-join that reports a date such that no row exists with the date following.

This will find the first day in a gap, but if there are runs of multiple days missing it won't report all the dates in the gap.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • 3
    Modified it slightly to account for duplicate date value that caused the self join to go crazy: `SELECT DISTINCT DATE(r1.date) + INTERVAL 1 DAY AS missing_date FROM mytable r1 LEFT OUTER JOIN (SELECT DISTINCT date FROM mytable ) r2 ON DATE(r1.date) = DATE(r2.date) - INTERVAL 1 DAY WHERE r1.date BETWEEN '2011-06-19' AND '2012-08-30' AND r2.date IS NULL` – giorgio79 Aug 30 '12 at 13:41
  • This work only if the missing date is one day but if you have more than one its fail – Martin Varta Nov 19 '14 at 17:42
  • @MartinVarta, yes that's true. I already said that in my answer above. – Bill Karwin Nov 19 '14 at 18:24
  • @BillKarwin this solution can help, can change the range. SELECT * FROM ( SELECT DATE_ADD('2014-01-01', INTERVAL t4+t16+t64+t256+t1024 DAY) missing FROM (SELECT 0 t4 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 ) t4, (SELECT 0 t16 UNION ALL SELECT 4 UNION ALL SELECT 8 UNION ALL SELECT 12 ) t16, (SELECT 0 t64 UNION ALL SELECT 16 UNION ALL SELECT 32 UNION ALL SELECT 48 ) t64, (SELECT 0 t256 UNION ALL SELECT 64 UNION ALL SELECT 128 UNION ALL SELECT 192) t256 ) b WHERE missing NOT IN (SELECT fDesde262 FROM py262) AND ausente – Martin Varta Nov 20 '14 at 11:51
  • @MartinVarta, great, why don't you post that as an answer? – Bill Karwin Nov 20 '14 at 14:12
8
  1. CREATE TABLE Days (day DATE PRIMARY KEY);

  2. Fill Days with all the days you're looking for.

    mysql> INSERT INTO Days VALUES ('2011-01-01');
    mysql> SET @offset := 1;
    mysql> INSERT INTO Days SELECT day + INTERVAL @offset DAY FROM Days; SET @offset := @offset * 2;
    

    Then up-arrow and repeat the INSERT as many times as needed. It doubles the number of rows each time, so you can get four month's worth of rows in seven INSERTs.

  3. Do an exclusion join to find the dates for which there is no match in your reports table:

    SELECT d.day FROM Days d 
    LEFT OUTER JOIN Reports r ON d.day = DATE(r.reportdatetime) 
    WHERE d.day BETWEEN '2011-01-01' AND '2011-04-30' 
        AND r.reportdatetime IS NULL;`
    
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Hi Bill, thank you for the comment. As Sunny, mentioned, step 2 would be a challenge. Any suggestion on how to do that other than manually? – Laxmidi Jul 01 '11 at 17:59
  • I've edited the above with an example of filling the table with 128 days worth of rows. – Bill Karwin Jul 01 '11 at 19:14
0

Try this

SELECT DATE(t1.datefield) + INTERVAL 1 DAY AS missing_date FROM table t1 LEFT OUTER JOIN table t2 ON DATE(t1.datefield) = DATE(t2.datefield) - INTERVAL 1 DAY WHERE DATE(t1.datefield) BETWEEN '2020-01-01' AND '2020-01-31' AND DATE(t2.datefield) IS NULL;

If you want to get missing dates in a datetime field use this.

SELECT CAST(t1.datetime_field  as DATE) + INTERVAL 1 DAY AS missing_date FROM table t1 LEFT OUTER JOIN table t2 ON CAST(t1.datetime_field  as DATE) = CAST(t2.datetime_field  as DATE) - INTERVAL 1 DAY WHERE CAST(t1.datetime_field  as DATE) BETWEEN '2020-01-01' AND '2020-07-31' AND CAST(t2.datetime_field  as DATE) IS NULL;
Ruchira Nawarathna
  • 1,137
  • 17
  • 30
0

It could be done with a more complicated single query, but I'll show a pseudo code with temp table just for illustration:

Get all dates for which we have records:

CREATE TEMP TABLE AllUsedDates

SELECT DISTINCT reportdatetime
INTO AllUsedDates;

now add May 1st so we track 04-30

INSERT INTO AllUsedData ('2011-05-01')

If there's no "next day", we found a gap:

SELECT A.NEXT_DAY
FROM
    (SELECT reportdatetime AS TODAY, DATEADD(reportdatetime, 1) AS NEXT_DAY FROM AllUsed Dates) AS A
WHERE
    (A.NEXT_DATE NOT IN (SELECT reportdatetime FROM AllUsedDates)
    AND
    A.TODAY <> '2011-05-01') --exclude the last day
Sunny Milenov
  • 21,990
  • 6
  • 80
  • 106
0

If you mean reportdatetime has the entry of "Feb 2, 2011" but other fields associated to that date are not present like below table snap

reportdate  col1    col2
5/10/2011   abc xyz
2/2/2011        
1/1/2011    bnv oda

then this query works fine

select reportdate from dtdiff where reportdate not in (select df1.reportdate from dtdiff df1, dtdiff df2 where df1.col1 = df2.col1)
Rahul
  • 76,197
  • 13
  • 71
  • 125
  • Hi Rahul, thank you for the message. Sorry that I wasn't clearer. I meant that Feb 2, 2011 doesn't have an entry in the table. It's missing from the reportdatetime column. – Laxmidi Jul 01 '11 at 18:01
  • @Laximidi, in that case you have to have another table filled with all the dates between Jan 1, 2011 thru April 30, 2011 and then you can simply find the missing date by issueing the query `select distinct repdt from dtall where repdt >= '1/1/2011' and repdt <= '1/6/2011' and repdt not in (select distinct dt from dtdiff)` – Rahul Jul 01 '11 at 18:22
0

The solutions above seem to work, but they seem EXTREMELY slow (taking possibly hours, I waited for 30 min only) at least in my database.

This clause takes less than a second in same database (of course you need to repeat it manually dozen times and possibly change function names to find the actual dates). pvm = my datetime, WEATHER = my table.

mysql> select year(pvm) as _year,count(distinct(date(pvm))) as _days from WEATHER where year(pvm)>=2000 and month(pvm)=1 group by _year order by _year asc;

--ako

Arto Kojo
  • 1
  • 1