0

I have a list of accounts and their cost which changes every few days. In this list I only have the start date every time the cost updates to a new one, but no column for the end date. Meaning, I need to populate a list of dates when the end date for a specific account and cost, should be deduced as the start date of the same account with a new cost.

More or less like that:
Account   start date  cost
 one           1/1/2016   100$
 two           1/1/2016   150$
 one           4/1/2016   200$
 two           3/1/2016   200$

And the result I need would be:
Account    date    cost
 one       1/1/2016  100$
 one       2/1/2016  100$
 one       3/1/2016  100$
 one       4/1/2016  200$
 two       1/1/2016  150$
 two       2/1/2016  150$
 two       3/1/2016  200$

For example, if the cost changed in the middle of the month, than the sample data will only hold two records (one per each unique combination of account-start date-cost), while the results will hold 30 records with the cost for each and every day of the month (15 for the first cost and 15 for the second one). The costs are a given, and no need to calculate them (inserted manually).

Note the result contains more records because the sample data shows only a start date and an updated cost for that account, as of that date. While the results show the cost for every day of the month.

Any ideas?

lironavr
  • 21
  • 1
  • 4
  • 1
    how is this optimization ? – t-clausen.dk Feb 08 '16 at 08:46
  • If you know how to populate when you have start and end date, then just select max(date) and min(date) group by your account and u'll have your start and end date – sagi Feb 08 '16 at 08:48
  • 1
    @t-clausen.dk: we should optimize a non-existing query to a working one. – Tim Schmelter Feb 08 '16 at 08:48
  • The question is also not very clear. Please elaborate more on your requirement. Explain the rules, why your result contains more records than your sample data? What about the costs, how are they calculated? – Tim Schmelter Feb 08 '16 at 08:50
  • @TimSchmelter the question is not that hard to understand. Solution may be a bit harder though – t-clausen.dk Feb 08 '16 at 09:45
  • @t-clausen.dk: how is it not hard to understand? OP has not even mentioned that he wants to generate a list of consecutive days even if they are missing. You are presuming this. What if OP just has shown an incomplete source sample? – Tim Schmelter Feb 08 '16 at 09:49
  • @lironavr: You should edit your question instead of posting comments. – Tim Schmelter Feb 08 '16 at 09:50
  • @TimSchmelter well I guess I considered the comment part of the question. I agree that the question should be updated instead – t-clausen.dk Feb 08 '16 at 09:58
  • Updated the question, thanks for the feedback! – lironavr Feb 08 '16 at 10:31
  • @lironavr since you updated your question after my answer was posted, I was wondering if this answer solved your problem – t-clausen.dk Feb 08 '16 at 13:18

2 Answers2

1

Solution is a bit long.

I added an extra date for test purposes:

DECLARE @t table(account varchar(10), startdate date, cost int)

INSERT @t 
values
('one','1/1/2016',100),('two','1/1/2016',150),
('one','1/4/2016',200),('two','1/3/2016',200),
('two','1/6/2016',500) -- extra row

;WITH CTE as
( SELECT
    row_number() over (partition by account order by startdate) rn,
    *
  FROM @t
),N(N)AS 
(
  SELECT 1 FROM(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))M(N)
),
tally(N) AS -- tally is limited to 1000 days
(
  SELECT ROW_NUMBER()OVER(ORDER BY N.N) - 1 FROM N,N a,N b
),GROUPED as
(
  SELECT
    cte.account, cte.startdate, cte.cost, cte2.cost cost2, cte2.startdate enddate
  FROM CTE
  JOIN CTE CTE2
  ON CTE.account = CTE2.account
  and CTE.rn = CTE2.rn - 1
)
-- used DISTINCT to avoid overlapping dates
SELECT DISTINCT
  CASE WHEN datediff(d, startdate,enddate) = N THEN cost2 ELSE cost END cost,
  dateadd(d, N, startdate) startdate,
  account
FROM grouped
JOIN tally
ON datediff(d, startdate,enddate) >= N

Result:

cost  startdate   account
100   2016-01-01  one
100   2016-01-02  one
100   2016-01-03  one
150   2016-01-01  two
150   2016-01-02  two
200   2016-01-03  two
200   2016-01-04  one
200   2016-01-04  two
200   2016-01-05  two
500   2016-01-06  two
t-clausen.dk
  • 43,517
  • 12
  • 59
  • 92
0

Thank you @t-clausen.dk!

It didn't solve the problem completely, but did direct me in the correct way.

Eventually I used the LEAD function to generate an end date for every cost per account, and then I was able to populate a list of dates based on that idea.

Here's how I generate the end dates:

DECLARE @t table(account varchar(10), startdate date, cost int)
INSERT @t 
values
('one','1/1/2016',100),('two','1/1/2016',150),
('one','1/4/2016',200),('two','1/3/2016',200),
('two','1/6/2016',500)

select account 
      ,[startdate]
      ,DATEADD(DAY, -1, LEAD([Startdate], 1,'2100-01-01') OVER (PARTITION BY account ORDER BY [Startdate] ASC)) AS enddate 
      ,cost
from @t

It returned the expected result:

account   startdate        enddate       cost
  one       2016-01-01    2016-01-03      100
  one       2016-01-04    2099-12-31      200
  two       2016-01-01    2016-01-02      150
  two       2016-01-03    2016-01-05      200
  two       2016-01-06    2099-12-31      500

Please note that I set the end date of current costs to be some date in the far future which means (for me) that they are currently active.

Community
  • 1
  • 1
lironavr
  • 21
  • 1
  • 4