I have a MySQL data base that includes a column of dates that are all incomplete. The nature of the data is that it is monthly data, so no day of the month has been assigned. To have a concrete example (different than the actual application), let's say it's a table of monthly sales - call it monthly_sales
- at different store locations, with columns
store_name | month_date | sales
with types varchar(20)
, date
, and float
, respectively. The month_date
column has only incomplete dates such as 2016-01-00
since the sales
column contains data that covers the whole month.
Here's code to create a toy table as described:
create table monthly_sales (store_name varchar(20), month_date date, sales float);
insert into monthly_sales values
('New York', 20160100, 30000),
('New York', 20160200, 6000),
('Boston', 20160100, 8000),
('Boston', 20160200, 3000);
The database design - for better or worse - is fixed, so there's no option to change this design now.
The question comes in trying to structure a query that compares same-store sales in consecutive months. I tried this query:
select * from monthly_sales as ms1
join monthly_sales as ms2 on ms1.store_name=ms2.store_name and ms2.month_date=date_sub(ms1.month_date, interval 1 month)
This fails to work because the documented behavior of date_sub
is that it returns null
when given incomplete dates. The desired output would have been
New York | 2016-02-00 | 6000 | New York | 2016-01-00 | 30000
Boston | 2016-02-00 | 3000 | Boston | 2016-01-00 | 8000
(Note FWIW that, in this case, the operation is conceptually well-defined because only the day is undefined whereas I'm doing the arithmetic on the months. This is analogous to taking the difference in days between two dates without regard to the time of day, if any, attached to the dates in that difference.)
I did search prior to asking, and found related but different questions. I believe that mine is different enough in the details to avoid being marked duplicate. Among the questions that I checked:
- Operations on incomplete dates in MYSQL This question is over 5 years old referring to an older version of MySQL. It is also fixed specifically on getting the difference in dates, whereas I'm open to any solution that makes the correct logical join even if I never actually compute a difference in dates.
- MySQL: Count of records with consecutive months This deals with consecutive months, but it has complete dates. (The answers also use nested sequences of sub-queries that are going to kill my real application.)
What work-arounds are available here? I don't care about doing the date arithmetic for its own sake and am open to alternate ways of forming the join to accomplish the stated goal.