3

I have this SELECT:

SELECT
  DATE_FORMAT(`created`, '%Y-%m') as byMonth,
  COUNT(*) AS Total 
FROM 
  `qualitaet`
WHERE
  `created` >= MAKEDATE(year(now()-interval 1 year),1) + interval 5 month
AND
  `status`=1
GROUP BY 
  YEAR(`created`), MONTH(`created`)
ORDER BY 
  YEAR(`created`) ASC

and get this result:

| byMonth | Total |
| 2015-06 |   2   |
| 2015-09 |  12   |
| 2015-10 |   3   |
| 2015-12 |   8   |
| 2016-01 |   1   |

see SQL-Fiddle here

The WHERE clause is important because i need it as current fiscal year starting on June, 1 in my example.

As you can see, i have no records for Jul, Aug and Nov. But i need this records with zero in Total.

So my result should look like this:

| byMonth | Total |
| 2015-06 |   2   |
| 2015-07 |   0   |
| 2015-08 |   0   |
| 2015-09 |  12   |
| 2015-10 |   3   |
| 2015-11 |   0   |
| 2015-12 |   8   |
| 2016-01 |   1   |

is there a way to get this result?

bernte
  • 1,184
  • 2
  • 19
  • 34
  • 2
    you will need to either create a calendar table and use this datasource to left join to get your resultset, or use a sequence generator to construct a sequence of months and again left join.. see this post on **[creating calendar table](http://stackoverflow.com/questions/34545056/mysql-fill-in-missing-dates-between-two-dates-for-a-given-status/34545534#34545534)**. or this for **[sequence generator in mysql](http://stackoverflow.com/questions/34545297/in-mysql-is-there-any-function-like-level-in-oracle/34545764#34545764)**. both are similar, one ends up with reusable calendar table afterwards – amdixon Jan 01 '16 at 15:02
  • 1
    ...or handle the issue of missing results in the presentation layer (e.g. with a bit of PHP) – Strawberry Jan 01 '16 at 15:06
  • yes i missed that one ;) – amdixon Jan 01 '16 at 15:07

1 Answers1

2

You need to generate all the wanted dates, and then left join your data to the dates. Note also that it is important to put some predicates in the left join's ON clause, and others in the WHERE clause:

SELECT
  CONCAT(y, '-', LPAD(m, 2, '0')) as byMonth,
  COUNT(`created`) AS Total 
FROM (
  SELECT year(now())     AS y UNION ALL
  SELECT year(now()) - 1 AS y 
) `years`
CROSS JOIN (
  SELECT  1 AS m UNION ALL
  SELECT  2 AS m UNION ALL
  SELECT  3 AS m UNION ALL
  SELECT  4 AS m UNION ALL
  SELECT  5 AS m UNION ALL
  SELECT  6 AS m UNION ALL
  SELECT  7 AS m UNION ALL
  SELECT  8 AS m UNION ALL
  SELECT  9 AS m UNION ALL
  SELECT 10 AS m UNION ALL
  SELECT 11 AS m UNION ALL
  SELECT 12 AS m
) `months`
LEFT JOIN `qualitaet` q
ON YEAR(`created`) = y 
  AND MONTH(`created`) = m
  AND `status` = 1
WHERE STR_TO_DATE(CONCAT(y, '-', m, '-01'), '%Y-%m-%d') 
    >= MAKEDATE(year(now()-interval 1 year),1) + interval 5 month
  AND STR_TO_DATE(CONCAT(y, '-', m, '-01'), '%Y-%m-%d') 
    <= now()
GROUP BY y, m
ORDER BY y, m

How does the above work?

  • CROSS JOIN creates a cartesian product between all available years and all available months. This is what you want, you want all year-month combinations with no gaps.
  • LEFT JOIN adds all the qualitaet records to the result (if they exist) and joins them to the year-month cartesian product from before. It is important to put prediactes like the status = 1 predicate here.
  • COUNT(created) counts only non-NULL values of created, i.e. when the LEFT JOIN produces no rows for any given year-month, we want 0 as a result, not 1, i.e. we don't want to count the NULL value.

A note on performance

The above makes heavy use of string operations and date time arithmetic in your ON and WHERE predicates. This isn't going to perform for lots of data. In that case, you should better pre-truncate and index your year-months in the qualitaet table, and operate only on those values.

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • Hi, the result is the same as my first result :( – bernte Jan 01 '16 at 16:08
  • and now it starts at 2016-01 – bernte Jan 01 '16 at 16:11
  • now it looks nearly perfect! Only " 2015-06 | 2 " is missing :( – bernte Jan 01 '16 at 16:20
  • 1
    @bernte: Well, I was a bit confused by your interval calculation. Anyway, if you know the start date, just put that instead... In the end, my solution is just showing ideas about how to solve this kind of problem. There might be simpler solutions. – Lukas Eder Jan 01 '16 at 16:27
  • I did it because i don't want to change the code every year. with the interval calculation it jumps automatic into the next fiscal-year – bernte Jan 01 '16 at 16:31
  • 1
    OK, I see. I guess it makes sense not to hard code the years in the `years` table, then. See the update, that generates the last two years – Lukas Eder Jan 01 '16 at 16:35
  • can you say something about the performance? on table qualitaet are a lot of rows. is LPAD not a killer? maybe it ist better when i do it like year, month, total? – bernte Jan 01 '16 at 16:44
  • 1
    @bernte: LPAD isn't the killer here because you will get at most 12 rows. The killer are `YEAR(created)` and `MONTH(created)` as well as the whole `WHERE` predicate. It's probably better if you store your `created` column also as an indexed `created-year-month VARCHAR(6)` column – Lukas Eder Jan 01 '16 at 16:56