3

I have a table like below:

Book        BookSales    Date
BookA       $1           2013-03-04
BookB       $2           2013-03-04
BookA       $3           2013-03-05

... ...

I run the query -

select book,booksales,date 
from tblBook 
where date>='03/03/2013'  
and date<='03/05/2013' 
and book in ('BookA','BookB') 
order by date,book

Now it returns the result as below

BookA     $1            2013-03-04
BookB     $2            2013-03-04
BookA     $3            2013-03-05

But what i want is returning the empty result($0) for book b as well if it does not exists for that date as shown below). Here, the record for book b does not exist for date 2013-03-05.

BookA     $1            2013-03-04
BookB     $2            2013-03-04
BookA     $3            2013-03-05
BookB     $0            2013-03-05  <-- I want to include this record

How can I achieve this ? If it is not possible to achieve this way, what other approach can I consider?

Erran Morad
  • 4,563
  • 10
  • 43
  • 72

3 Answers3

3

There must be an easier way of doing this, but here's what I came up with. Use common table expressions to get a distinct list of dates and a distinct list of books, then CROSS JOIN them to get the full matrix of dates and books. Once you've got the full matrix, you can OUTER JOIN your table to it, replacing any nulls with 0.

Cross joins can get messy pretty quickly, but if it's just a list of dates and list of books it should work fine.

;with dates AS
(
  select distinct date
  from tblBook
), books AS
(
  select distinct book
  from tblBook
)
select b.book, coalesce(tb.booksales, 0), d.[date]
from dates as d
cross join books as b
left outer join tblBook as tb
  on d.[date] = tb.[date]
    and b.book = tb.book
where d.[date] >='03/03/2013'  
and d.[date] <='03/05/2013' 
and b.book in ('BookA','BookB') 
order by d.[date], b.[book]

Working SQLFiddle here.

Jeff Rosenberg
  • 3,522
  • 1
  • 18
  • 38
2

Resolving Missing Aggregation Data With Dimensional Sub Queries and Outer Join SQL

You can solve your problem by using a sub query that produces all the distinct values that you need to report on. I am assume that the output is supposed to provide a way to track the sales of each book on each day, even if no sales transactions have been entered into the table. Here is one possible solution and the output:

Problem Analysis:

The request to see a data value for which there is no physical representation in the queried table is possible by constructing a sub query that represents all possible values whether or not they already exist in the table.

In this case, the two dimensions reported on are: "book" and "date". There must be an entry for total sales for each book for each day of transactions recorded in the book sales table.

Assumptions: It wasn't clear in the example if tblBook represents already aggregated sales, or if there can be multiple records for a given "book" and "date" combination... as in the style of a transaction register that records each sale as they happen. The solution query aggregates sales amounts to provide a single record for each book on each day.

Note: This example was tested on a MySQL RDBMS. The SQL used in this example is ANSI standard, so it should work on a SQLServer database as well. You may have to find the equivalent to the ifNULL() function used in this solution

The SQL Code

 SELECT REPORTING_DIMENSIONS.book, 
    sum(ifNULL(sales_data.bookSales,0)) as totalSales,
    REPORTING_DIMENSIONS.date

 FROM 
   (SELECT book_list.book, date_list.date 
      FROM
        (SELECT distinct book
           FROM tblBook
           ) book_list

       CROSS JOIN
       (SELECT distinct date
          FROM tblBook
          ) date_list
   ) REPORTING_DIMENSIONS

   LEFT JOIN tblBook SALES_DATA
   ON REPORTING_DIMENSIONS.book = SALES_DATA.book
      AND REPORTING_DIMENSIONS.date = SALES_DATA.date

 GROUP BY REPORTING_DIMENSIONS.book, 
    REPORTING_DIMENSIONS.date

The Output:

 |  BOOK | TOTALSALES |                         DATE |
 |-------|------------|------------------------------|
 | BookA |          2 | March, 04 2013 00:00:00+0000 |
 | BookA |          6 | March, 05 2013 00:00:00+0000 |
 | BookB |          4 | March, 04 2013 00:00:00+0000 |
 | BookB |          0 | March, 05 2013 00:00:00+0000 |

Discussion of Results

The value of zero for BookB on 2013-03-15 actually was a NULL value. When an OUTER (i.e., LEFT) join is made for which there is no match on the joining values, a null is returned.

The only fancy step was the CROSS JOIN designation. Also known as a CARTESIAN PRODUCT, this generates a record for each possible "book" and "date" combination recorded in the existing table.

The sub query aliased as REPORTING DIMENSIONS is a fixed, dynamic feature of the query. It allows the query to adjust its output as new books or dates are added to its scope.

Additional Considerations:

It is possible to have NO SALES for NONE of the products on one, all or many of the calendar days contained in the range of your report query. This again will affect the output of the original report query because there will be holes in the output of results.

This can be fixed by creating a fixed dimension table just for date values. Pre populate it with a record for each discrete calendar day for a few years before and after the scope of your sales data. A simple T/SQL block with a looping insert operation can easily populate this table. Once it's been populated, it is very unlikely it needs to be touched.

Richard Pascual
  • 2,023
  • 1
  • 14
  • 22
1

If you have this

Book        BookSales    Date
BookA       $1           2013-03-04
BookB       $2           2013-03-04
BookA       $3           2013-03-06

and want

BookA     $1            2013-03-04
BookB     $2            2013-03-04
BookA     $0            2013-03-05
BookB     $0            2013-03-05
BookA     $3            2013-03-06
BookB     $0            2013-03-06

Stolen from Jeff Rosenberg's answer & t-sql get all dates between 2 dates

;WITH dates AS
(
  SELECT CAST('2013-03-04' AS DATETIME) AS [date]
  UNION ALL
  SELECT DATEADD(dd, 1, [date])
    FROM dates s
   WHERE DATEADD(dd, 1, [date]) <= CAST('2013-03-06' AS DATETIME)), books AS
(
  select distinct book
  from tblBooks
)
select b.book, coalesce(tb.booksales, 0), d.[date]
from dates as d
cross join books as b
left outer join tblBooks as tb
  on d.[date] = tb.[date]
    and b.book = tb.book

And if you need aggregation

...
    SELECT b.Book, COALESCE(SUM(tb.BookSales), 0), d.[Date]
    FROM Dates d
        CROSS JOIN Books b
        LEFT JOIN tblBooks tb ON d.[Date] = tb.Date
            AND b.Book = tb.Book
    GROUP BY b.Book, d.[Date]
Community
  • 1
  • 1
EarlOfEnnui
  • 555
  • 4
  • 7