0

For example purposes, suppose I have a table containing today's sales total, such as:

CREATE TABLE 
(
     MyDate DateTime, 
     SalesTotal Decimal(10,2)
)

Data gets input M-F but not on weekends. So there will always be gaps in the data for some missing dates, making this query more difficult.

I want to write a query that shows every day's SalesTotal compared to [the day before's sales total], in percentages. For example:

12-22  $100 in sales was 1% greater than day before
12-21  $99 in sales was 5% greater than day before
12-20  $94 in sales was -3% greater than day before
12-19  $97 in sales was 1% greater than day before
12-18  $96 in sales was 4% greater than day before
--- notice the weekend gap here ---
12-15  $92 in sales was 2% greater than day before

I believe I need to join the table to itself or use a subquery but I'm stuck on how to get yesterday's data appended in each row to today's data so that I can do the otherwise simple math calculation, especially since yesterday's date is not always the "today's" date minus 1. And the natural order of the data is not sequential (and it is more complicated than my example) so I don't want to use ROW_NUMBER to bind to the previous day's data. What I do know is that "yesterday's" data is always [the last date prior to "today's data" even if not 1 day before]. How can I bind today's data to yesterday's data so I can compare the sales total?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
stackonfire
  • 1,525
  • 4
  • 17
  • 23

3 Answers3

2

Given that you make reference to the row_number window function in your question, I am going to assume that you are on at least SQL Server 2012.

Sample Data:

create table #myTable 
    (
        MyDate datetime
        , SalesTotal decimal(10,2)
        )

insert into #myTable
values ('2017-12-14', 90.00)
, ('2017-12-15', 92.00)
, ('2017-12-18', 96.00)
, ('2017-12-19', 97.00)
, ('2017-12-20', 94.00)
, ('2017-12-21', 99.00)
, ('2017-12-22', 100.00)

Answer:

You can take advantage of the lag/lead window functions to compare values between rows. Note that the sub-query has the MyDate value of 2017-12-14 in the record set because it needs to be present to get the SalesTotal value for comparison, then the top level query filters that record out.

declare @bgn_dt date = '2017-12-15' --set by OP
    , @end_dt date = '2017-12-22' --set by OP
    , @lag_dt date;

set @lag_dt = (select max(MyDate) from #myTable where MyDate < @bgn_dt) --get the "yesterday" that the @bgn_dt will need

select a.MyDate
, a.SalesTotal
, format(((1.0 * a.SalesTotal) / a.SalesTotalPrevDay) - 1, '0%') as SalesTotalChange
from (
    select t.MyDate
    , t.SalesTotal
    , lag(t.SalesTotal, 1, NULL) over (/*partition by (if needed)*/ order by t.MyDate asc) as SalesTotalPrevDay
    from #myTable as t
    where 1=1
    and t.MyDate between @lag_dt and @end_dt
    ) as a
where 1=1
and a.MyDate >= @bgn_dt

Output:

+-------------------------+------------+------------------+
|         MyDate          | SalesTotal | SalesTotalChange |
+-------------------------+------------+------------------+
| 2017-12-15 00:00:00.000 | 92.00      | 2%               |
| 2017-12-18 00:00:00.000 | 96.00      | 4%               |
| 2017-12-19 00:00:00.000 | 97.00      | 1%               |
| 2017-12-20 00:00:00.000 | 94.00      | -3%              |
| 2017-12-21 00:00:00.000 | 99.00      | 5%               |
| 2017-12-22 00:00:00.000 | 100.00     | 1%               |
+-------------------------+------------+------------------+

Update:

In response to Pரதீப்'s comment, I thought I'd explain why someone might use where 1=1 in their query even though it seems unnecessary. Most of the time it is used to initialize the where clause in SQL statement built at run time (Dynamic SQL). Another use for it is in development/debugging efforts where you may be commenting in/out various constraints including the first listed in the where clause. In the end, it has no impact on the performance of the query, but may make your life easier.

Update 2:

To explain further as to why, per stackonfire's description, the first record is always null on the inner query. All window functions are only concerned with records that are in the current query.

For example, if you look at the query below, you would want row_number to return 1, 2, 3 (as opposed to 1, 2, 4). The same is true of a lag/lead function, that some record has to be the first/last therefore there is no previous/next record to retrieve a value from.

I have updated my initial answer to add the dates necessary to the where clause.

create table #letters
    (
        letter char(1)
    )

insert into #letters
values ('a'), ('b'), ('c'), ('d')

select *
, row_number() over (order by l.letter asc) as row_nbr
from #letters l
where l.letter <> 'c'
tarheel
  • 4,727
  • 9
  • 39
  • 52
  • +1 but `where 1=1` is totally unnecessary unless some conditions are dynamically added later. Also `datename(weekday, t.MyDate)` is already a static condition so there is no need for `1=1` there. At least mention why you have added this condition. This could cause confusion to new learners – Pரதீப் Dec 24 '17 at 03:58
  • @Pரதீப் Fair point on beginners possibly getting distracted by seemingly unnecessary code. My explanation has been added in the answer. – tarheel Dec 24 '17 at 06:35
  • @tarheel Thanks very much for showing a good use of lag. I did some testing and found it to work, however, a couple questions: a) Why have you included the Saturday and Sunday condition in the subquery? As mentioned in the question, sometimes even weekdays could be missing (e.g. holiday). b) I also find that the subquery returns null for SalesTotalPrevDay for the first record in the date range. Your comment suggests you had 12-14-2017 in the subquery, which makes sense, but that's now showing in your code. – stackonfire Dec 24 '17 at 08:23
  • To help others, I also found that if you have another column in your data, such as a store location, then you'd want to change the `order by t.MyDate asc` to `partition by StoreLocation order by t.MyDate asc`. If you don't, it will find the previous day's data for any StoreLocation rather than for the applicable store in the second day's row. – stackonfire Dec 24 '17 at 08:29
  • @stackonfire To answer your questions a) it is because I misunderstood the sample scenario. I have removed that from the where clause as it is not necessary given your data model. b) The first query will always return `null` or any value specified in the 3rd argument of the `lag`/`lead` function because all windows functions are only concerned with only the records selected. I added further explanation on this in the answer above. Lastly, you are correct, that if you add the need to separate sales by store to the sample scenario, the `partition by` will be needed. – tarheel Dec 24 '17 at 22:17
2

Please try using below query to get the desired results.

If the table is in the below form

create table compare 
(  date1 date,
   amount decimal(10,2)
);

Use below query to get the percentage change compared with previous date

SELECT t1.rno, 
       t1.date1, 
       ( ( t1.amount - t2.amount ) / t1.amount ) * 100 AS percentage 
FROM   (SELECT Row_number() OVER(ORDER BY date1) RNO, 
               date1, 
               amount 
        FROM   compare) t1 
       INNER JOIN (SELECT Row_number() OVER(ORDER BY date1) RNO, 
                          date1, 
                          amount 
                   FROM   compare) t2 
               ON t2.rno + 1 = t1.rno 
ORDER  BY t1.rno 

Please leave a message for any changes

Pரதீப்
  • 91,748
  • 19
  • 131
  • 172
1

Try working with this

    CREATE TABLE MyTable (MyDate DateTime, SalesTotal Decimal(10,2) )

    insert into myTable select '20171222 13:00',    100 union all
      select '20171222 13:01',  80  union all
      select '20171222 13:02',  50  union all
      select '20171221 10:00',  50  union all
      select '20171221 10:01',  75  union all
      select '20171221 10:02',  90  union all
      select '20171225 10:02',  45  union all
      select '20171225 10:03',  75  union all
      select '20171225 10:04',  60   



    select  sales.Date, sales.total,  
format(((1.0 * Sales.Total) / lead(sales.total, 1, Null) over (order by sales.Date asc) ) - 1, '0%') as SalesTotalChange

     from(
    select cast (a.MyDate as date) as 'Date', sum(salesTotal) total
    from MyTable a 
    group by cast (a.MyDate as date)) sales

EDIT: copy format from last answer

Horaciux
  • 6,322
  • 2
  • 22
  • 41