0

Been trying to sort this one out for a while. I'd really appreciate any help.

I've got this table where I'm getting 2 columns with date and int values respectively. The problem is that mysql skips the date values wherever the int value is null.

Here the sql statement

SELECT DATE_FORMAT(sales_date_sold, '%b \'%y')
    AS sale_date, sales_amount_sold 
    AS sale_amt 
    FROM yearly_sales 
    WHERE sales_date_sold BETWEEN DATE_SUB(SYSDATE(), INTERVAL 2 YEAR) AND SYSDATE() 
    GROUP BY YEAR(sales_date_sold), MONTH(sales_date_sold) 
    ORDER BY YEAR(sales_date_sold), MONTH(sales_date_sold) ASC;

There aren't any values for feb 2011 so that month gets skipped, along with a few others. Coalesce and if_null don't work too.

Andrew Lee
  • 2,543
  • 3
  • 20
  • 31
goonerify
  • 1,668
  • 25
  • 27

4 Answers4

1

You need a row source that provides values for all of the months in the dimension, and then left join your yearly_sales table to that.

You are doing a GROUP BY, you most likely want an aggregate on your measure (sales_amount_sold), or you don't want a GROUP BY. (The query in your question is going to return a value from sales_amount_sold for only one row in a given month. That may be what you want, but its a very odd resultset to return.)


One approach is to have a "calendar_month" table that contains DATE values all of the months you want returned. (There are other ways to generate this, existing answers to questions elsewhere on stackoverflow)

SELECT m.month                            AS sale_date
     , IFNULL(SUM(s.sales_amount_sold),0) AS sale_amt
  FROM calendar_months m
  LEFT
  JOIN yearly_sales s
    ON s.sales_date_sold >= m.month 
   AND s.sales_date_sold < DATE_ADD(m.month,INTERVAL 1 MONTH)
 WHERE m.month BETWEEN DATE_SUB(SYSDATE(), INTERVAL 2 YEAR) AND SYSDATE()
 GROUP BY m.month
 ORDER BY m.month

This query returns a slightly different result, you are only going to get rows in groups of "whole months", rather than including partial months, as in your original query, because the WHERE clause on sale_date references two years before the current date and time, rather than the "first of the month" two years before.

A calendar_months table is not necessarily required; this could be replaced with a query that returns the row source. In that case, the predicate on the month value could be moved from the outer query into the subquery.


Addendum: if you use a calendar_month table as a rowsource, you'll need to populate it with every possible "month" value you want to return.

CREATE TABLE calendar_month
(`month` DATE NOT NULL PRIMARY KEY COMMENT 'one row for first day of each month');

INSERT INTO calendar_month(`month`) VALUES ('2011-01-01'),('2011-02-01'),('2011-03-01') 

As an alternative, you can specify a dynamically generated rowsource, as an inline view, rather than a table reference. (You could use a similar query to quickly populate a calendar_months table.)

You can wrap this query in parenthesis, and paste it between FROM and calendar_months in the previous query I provided.

SELECT DATE_ADD('1990-01-01',INTERVAL 1000*thousands.digit + 100*hundreds.digit + 10*tens.digit + ones.digit MONTH) AS `month`
  FROM ( SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 ) ones
  JOIN ( SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 ) tens
  JOIN ( SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 ) hundreds
  JOIN ( SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 ) thousands
spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • Got an empty result set. Isn't it supposed to be (select yearly_sales s....left join calender_months... – goonerify Jun 26 '12 at 19:01
  • You'll need to populate calender_months with the first day of every month. You want the calendar_months row source to be the "driver", so it needs to be on the left side of a LEFT JOIN. – spencer7593 Jun 26 '12 at 19:04
  • Added example of dynamically generated inline view as rowsource for month values. – spencer7593 Jun 26 '12 at 19:27
  • Wow! Thanks that's really helpful. I've been working on an sp to use for this but i tried this and it seems to work – goonerify Jun 26 '12 at 23:01
  • The query with the reference/lookup calendar_months table is simpler, BUT you need to maintain that, and ENSURE that it has all the rows you need, that it doesn't get duplicates, and that it only contains rows for midnight of the first of the month. You can workaround some potential corruption of the table with a careful query in place of the table reference (SELECT DISTINCT DATE_FORMAT(c.month,'%Y-%m-01') AS month FROM calendar_months c) – spencer7593 Jun 27 '12 at 00:35
  • I wrote an sp and event for this, but even that is not working. If you've got the time, perhaps you could take a look at it. I just posted it on stackoverflow – goonerify Jun 27 '12 at 13:37
1

The problem is not that the value is NULL, the problem is that you are selecting data off your Database. If you don't have data for a specific month, MySQL has no way of selecting data that is not there.

The only way to solve this completely in MySQL is already answered in a very similar question

Community
  • 1
  • 1
Toote
  • 3,323
  • 2
  • 19
  • 25
0

I have had this problem before with timestamps. The solution I used was to create a reference table with all of your months. This could be a table with just the numbers 1-12 (12 rows) or you could go one step further and put the month names. Then you can left join your yearly_sales table to the 1_through_12 table to get every month.

Matt Dodge
  • 10,833
  • 7
  • 38
  • 58
0

Why don't you just use 0 instead of NULL?

  • You're right, i think a stored procedure and event can loop through the data to check for null values and insert zero values. Seems a bit like overkill for such a problem though – goonerify Jun 26 '12 at 19:06