0

I have a table with Date, Category, and Values. It is similar like this:

Date            Category       Sell
01/01/21        Apple           1
01/01/21        Orange          3
01/01/21        Banana          2
02/01/21        Banana          1
03/01/21        Orange          2

What I would like to do is to create a date row for each category even on the day that nothing was sell. The desired output should have the granularity of date and category:

Date            Category       
01/01/21        Apple           
01/01/21        Orange          
01/01/21        Banana          
02/01/21        Apple           
02/01/21        Orange          
02/01/21        Banana     
03/01/21        Apple           
03/01/21        Orange          
03/01/21        Banana 

    

I would like to have it in this format, but I'm not sure what is the best way to tackle this? I'm quite new to SQL so not sure what to search for. I am using Redshift SQL. My idea is to create a separate table of date and then separate table of category and then join it (but maybe there is a better approach)?

CREATE TABLE #DateIntervals (
    PeriodStartDate date,
    PeriodEndDate date
)

DECLARE @StartDate datetime = '01/01/2021';

DECLARE @EndDate datetime = DATEADD(DAY, -1, DATEADD(YEAR, 5, @StartDate));

WHILE @StartDate <= @EndDate

BEGIN
    INSERT INTO #DateIntervals (PeriodStartDate, PeriodEndDate)
    SELECT @StartDate, DATEADD(day, 6, @StartDate)

    SET @StartDate = DATEADD(day, 7, @StartDate)
END

However there is an error with this code as it said "Invalid operation: syntax error at or near "DECLARE" Position"

Would be greatly appreciated if anyone could give me a suggestion.

tlqn
  • 349
  • 1
  • 6
  • 18

1 Answers1

0

Use a cross join to generate the rows and then a left join to bring in the data. The first part seems to answer your question:

select d.date, c.category
from (select distinct date from t) d cross join
     (select distinct category from t) c
order by d.date, c.category;

To bring in the existing data, use left join:

select d.date, c.category,
       coalesce(t.sell, 0) as sell
from (select distinct date from t) d cross join
     (select distinct category from t) c left join
     t
     on d.date = t.date and c.category = t.category
order by d.date, c.category;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • thank you very much for your suggested solution. However with the first query that you wrote, if I understood correctly, it will not populate the date where none of the category was sold, right? And I would like to have a row per date per category level – tlqn Feb 17 '21 at 11:45
  • @tlqn maybe generating the date sequence then left joining to it will help https://stackoverflow.com/questions/7824831/generate-dates-between-date-ranges – Liam Kernighan Feb 17 '21 at 11:52
  • @tlqn . . . Correct. The sample data in the question has all the dates that are used for the results. If you want to fill in missing dates, I would suggest (in Redshift) using a `calendar` table. – Gordon Linoff Feb 17 '21 at 13:31