1

Started with TSQL last Wednesday...

I have the following data in tblStage1:

PROJECT    USERNAME           DATE          PERCENTAGE                  
---------  -----------------  ------------  ---------------------- 
Project 1  DOMAIN\Chris.User  03/01/2013    0.25                     
Project 1  DOMAIN\Chris.User  05/01/2013    0.75                     
Project 1  DOMAIN\Chris.User  07/01/2013    1                     
Project 1  DOMAIN\John.User   02/01/2013    1                     
Project 1  DOMAIN\John.User   06/01/2013    0.5                     

I have the following data in tblRawData

PROJECT    START_DATE   END_DATE
---------- -----------  ----------
Project 1  01/01/2013   09/01/2013

I would like to get the following data out into tblStage2 (data points are bound by START_DATE and END_DATE):

PROJECT    USERNAME           DATE          PERCENTAGE                  
---------  -----------------  ------------  ---------------------- 
Project 1  DOMAIN\Chris.User  01/01/2013    0                     
Project 1  DOMAIN\Chris.User  02/01/2013    0                         
Project 1  DOMAIN\Chris.User  03/01/2013    0.25                     
Project 1  DOMAIN\Chris.User  04/01/2013    0.25                      
Project 1  DOMAIN\Chris.User  05/01/2013    0.75                     
Project 1  DOMAIN\Chris.User  06/01/2013    0.75                     
Project 1  DOMAIN\Chris.User  07/01/2013    1                     
Project 1  DOMAIN\Chris.User  08/01/2013    1                     
Project 1  DOMAIN\Chris.User  09/01/2013    1                  
Project 1  DOMAIN\John.User   01/01/2013    0                     
Project 1  DOMAIN\John.User   02/01/2013    1                         
Project 1  DOMAIN\John.User   03/01/2013    1
Project 1  DOMAIN\John.User   04/01/2013    1                      
Project 1  DOMAIN\John.User   05/01/2013    1                     
Project 1  DOMAIN\John.User   06/01/2013    0.5                     
Project 1  DOMAIN\John.User   07/01/2013    0.5                     
Project 1  DOMAIN\John.User   08/01/2013    0.5                     
Project 1  DOMAIN\John.User   09/01/2013    0.5                  

I realize that there are a number of topics that relate to this subject such as this. I my case, I don't have any particular restrictions and I am looking for a clean routine that is relatively easy to understand.

I know there is a DateAdd function, but I haven't seen any INSERT INTO commands in the example statements. I am confused as to how one would iterate through the data set and create the interpolated values. I am still too green to understand the full context of the other examples and would greatly appreciate any help or clarification.

Edit Added additional information to the sample data for a better indication of my final goal. I will have multiple users in this data set. The USERNAME column is placed into the data set by the original source (a people picker on an InfoPath form). All "Percentages" are "0" until the first value is assigned then they retain that value until it is changed or the project reaches its end date. I hope this helps clarify!

Community
  • 1
  • 1
Shrout1
  • 2,497
  • 4
  • 42
  • 65

3 Answers3

2

It's not clear how USERNAME populated. I assume you have same USERNAME on project here. CTE below just build a DATE table, if you have your own date table, you can skip this part.

SQL Fiddle

DECLARE @ENDDate DATETIME

SELECT @ENDDate =  MAX(END_DATE) FROM tblRawData

;WITH tblDate AS 
( SELECT CAST(MIN(START_DATE) AS DATE) AS [date]
  FROM tblRawData 
  UNION ALL
  SELECT DATEADD(month,1,[DATE])
  FROM tblDate
  WHERE [DATE] < @ENDDate
 )
SELECT
  d.[date]
  ,r.[Project]
  ,UserName =   (SELECT MAX(USERNAME) FROM tblStage1 ts WHERE r.PROJECT = ts.PROJECT)
  ,Percentage = (SELECT ISNULL(MAX(Percentage),0) FROM tblStage1 ts WHERE r.PROJECT = ts.PROJECT AND ts.[date] <= d.[date])
FROM tblDate d
INNER JOIN tblRawData r
  ON d.[date] between r.[START_DATE] AND r.[END_DATE]
ORDER BY 2,1
OPTION (Maxrecursion 0) 

EDIT: Just found out the date is increased by month. I update the CTE query. However, you need make sure you have all project start and end date at first day of the month.


EDIT Base on your new sample date. the query became a little ugly now, however, it works. I can not think better solution right now.

New SQL Fiddler

DECLARE @ENDDate DATETIME

SELECT @ENDDate =  MAX(END_DATE) FROM tblRawData

;WITH tblDate AS 
( SELECT CAST(MIN(START_DATE) AS DATE) AS [date]
  FROM tblRawData 
  UNION ALL
  SELECT DATEADD(month,1,[DATE])
  FROM tblDate
  WHERE [DATE] < @ENDDate
 )
,ProjectList AS (
 SELECT Project,UserName
 FROM tblStage1
 GROUP BY Project,UserName
)
,cte AS (
SELECT
  d.[date]
  ,r.[Project]
  ,UserName = pl.Username
  ,CloseDate = (SELECT MAX(ts.[date]) FROM tblStage1 ts WHERE r.PROJECT = ts.PROJECT AND ts.UserName = pl.UserName AND ts.[date] <= d.[date])
FROM tblDate d
INNER JOIN tblRawData r
  ON d.[date] between r.[START_DATE] AND r.[END_DATE]
CROSS APPLY ProjectList pl
)
SELECT cte.[date],cte.project,cte.UserName,ISNULL(t.[PERCENTAGE],0) AS PERCENTAGE
FROM cte
LEFT JOIN tblStage1 t
 ON cte.PROJECT = t.PROJECT 
  AND cte.UserName = t.UserName  
  AND cte.CloseDate = t.[Date]
ORDER BY 2,3,1
EricZ
  • 6,065
  • 1
  • 30
  • 30
  • I am using the 101 date format, not the 103 - apologies if my original data set is somewhat vague. I will try to adapt this code! – Shrout1 May 07 '13 at 18:21
  • I realized just a moment ago that my data set wasn't a complete representation of the problem (I've updated it above). Are there any good sites that breakdown how CTE works so that I can adapt your code to my data set? Thanks again :) – Shrout1 May 07 '13 at 18:39
  • 1
    @Chris, I updated my answer. Please check if that works for you now. – EricZ May 08 '13 at 01:29
  • Thanks! I tried to update the SQL Fiddle data source, but for some reason it fails when I click "Execute SQL". The button greys out and nothing happens. In SQL Server 2008 my results are odd... It seems to only pick up on the first percentage value and then it repeats that value until the "End_Date". I.E. John.User has a "0" percentage value for "01/01/2013" and then a "1" percentage value for all dates thereafter. I think something is configured incorrectly on our server... I will keep researchning CTE! And thank you again! – Shrout1 May 08 '13 at 12:58
  • @chris, it may because the DATE in stage table is not always first day of the month. with your sample data, I assume you have all project date at the first day of the month, otherwise, you may get wrong result. – EricZ May 08 '13 at 13:08
  • I will link to some screenshots of my "live" data and the results in SQL server. The `Start_Date` and `End_Date` for this data are `2013-01-01` and `2014-05-01` respectively. **[Source Data](http://img.photobucket.com/albums/v34/Shrout1/DateFillRawData.jpg) and [Filled Data](http://img.photobucket.com/albums/v34/Shrout1/DateFillResults.png) screenshots**. I have added the "intended" values in red (MSPaint FTW!). Again, I can't seem to execute the fiddle once I change the source data, so it's hard to say how the fiddle would handle this. – Shrout1 May 08 '13 at 14:18
  • 1
    @Chris, once you changed source data, you need "Build Schema" before you can execute the query – EricZ May 08 '13 at 14:25
  • Yeah, I click "Build Schema" and then click "Run SQL" - the button changes to "Executing SQL" and eventually it returns a blank section with the sentence "Did this query solve the problem? If so..." I will try it on a different computer / different network and see if it fixes. Again, thank you for your help. Perhaps CTEs will click with me today and I'll be able to pinpoint exactly what I am doing wrong. – Shrout1 May 08 '13 at 17:01
  • Ok - the last select statement: `SELECT cte.[date],cte.project,cte.UserName,ISNULL(t.[PERCENTAGE],0) AS PERCENTAGE FROM cte` Do criterion need to be specified as to which `[PERCENTAGE]` field is being returned? It seems like the SQL Server is always returning the first `[PERCENTAGE]` value for that User_Name but it never catches the change. – Shrout1 May 08 '13 at 18:56
  • @Chris, I'm not sure what you mean, can you please create a SQL fiddler that with your data and query. Add the link as part of your question, so I can understand you better. – EricZ May 08 '13 at 19:17
  • I will try to do that; I may have to post from a different computer; apologies. Meanwhile, I modified the last select statement as such: `SELECT cte.[date],cte.CloseDate,cte.project,cte.UserName,ISNULL(t.[PERCENTAGE],0) AS PERCENTAGE FROM cte LEFT JOIN tblStage1 t ON cte.PROJECT = t.PROJECT AND cte.UserName = t.UserName --AND cte.CloseDate = t.[Date] ORDER BY 2,3,1`. Look at [this screenshot of the results](http://img.photobucket.com/albums/v34/Shrout1/CTEReturn.png). There are duplicate `[PERCENTAGE]` values present; only the first is ever returned. – Shrout1 May 08 '13 at 19:22
  • Ok got a fiddle to work - the browser on my office PC doesn't play nice with it. [SQL Fiddle](http://sqlfiddle.com/#!3/3d9ee/2) – Shrout1 May 09 '13 at 01:46
  • 1
    @Chris, why you comment out this line? " AND cte.CloseDate = t.[Date] ". If you add it back, you should have what you want. – EricZ May 09 '13 at 01:58
  • For some reason, SQL Server and SQL Fiddler execute the same code but get different results. I am going to start a new question on this as your statement *should* be the answer. Thank you again for all your help! I wish there was more I could do to repay you; I can't tell you how much I appreciate the time you've spent on this. – Shrout1 May 09 '13 at 11:18
  • Ok I figured it out! And (no surprise) it was my fault. My table had the "Date" field as "nvarchar(10)" and not "datetime". One day at a time... – Shrout1 May 09 '13 at 12:11
  • @Chris, I'm glad that you found out the root cause. Cheers! – EricZ May 09 '13 at 20:20
1

You could do this with the APPLY operator as well...

WITH DateList (Project, MonthStart) AS
    (
        SELECT 
            project
            , Start_Date
        FROM 
            tblRawData
        UNION ALL
        SELECT 
            dl.Project
            , DATEADD(MONTH, 1, dl.MonthStart)
        FROM 
            DateList dl
        JOIN
            tblRawData r 
            ON 
                r.Project = dl.Project
                AND 
                dl.MonthStart < r.End_Date
    )
SELECT 
    dl.Project
    , lastUser.UserName 
    , dl.monthstart [Date]
    , ISNULL(pct.Percentage, 0) Percentage
FROM 
    DateList dl
CROSS APPLY
    (
        SELECT 
            TOP 1 
            USERNAME
        FROM
            tblStage1 t1
        WHERE 
            t1.Project = dl.Project
        ORDER BY 
            t1.Date DESC
    ) lastUser
OUTER APPLY
    (
        SELECT 
            TOP 1 
            Percentage
        FROM
            tblStage1 t2
        WHERE 
            t2.Project = dl.Project
            AND 
            t2.Date <= dl.MonthStart
        ORDER BY 
            t2.Date DESC
    ) pct
JAQFrost
  • 1,431
  • 8
  • 8
0

A really easy way would be to create a temp table with the month values of each month you need then doing a left join on that table.

Out of curiosity - why is August 1 instead of 0?

Jeff B
  • 535
  • 1
  • 6
  • 15
  • The dates always fill into the future. Any months before the first assigned value are "0". Any months after an assigned value are a repeat of the value that was previously assigned. This value repeats until the next assigned value or until the "End_Date". How could I make a temp table that has all the month values in it? CTE? – Shrout1 May 07 '13 at 18:20