
I am brand new to SQL Server (like 10 mins). I have a test database that contains a few columns (Units, Date, Master ID, AVE Monthly Sales). The AVE Monthly Sales column is empty.

Background: I import data from Excel files, the files are built from other files, and the average sales calculation time in Excel is driving me crazy, thus this post.

What I would like to do is populate the Average Sales column with the average Units for the past 6 months, based on the date in the relevant row. I.e. if the date is 2016-06-30, I should get the average unit sales for that particular Master ID (Master ID is not my unique key, fyi) from 2016-01-01 up to and including 2016-06-30.

As mentioned, I am new to SQL. I am using SQL Server Express (running locally), and the data table (SALES) has about 8m rows. I am not sure if a calculation like this should be done in SQL, but I would like to try so I can check if it is in any way faster than my current approach (Excel).

It would be great if someone can explain this with an example.

Thank you. Mark.

marc_s

2 Answers


Replace someuniquefield with your primary key column; also, test it with just the SELECT first.

UPDATE t
SET [AVE Monthly Sales] = x.avgsum
FROM SALES t
JOIN (
    -- assumes one row per month per Master ID (6-row frame = 6 months)
    SELECT someuniquefield,
        AVG(Units) OVER (PARTITION BY [Master ID] ORDER BY [Date]
                         ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) AS avgsum
    FROM SALES
) AS x ON x.someuniquefield = t.someuniquefield
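If you would rather bound the window by actual dates than by row counts, a correlated subquery works too. Here is a runnable sketch of that idea; it uses SQLite in memory purely so the example runs without a SQL Server instance, and the column names are illustrative (in T-SQL you would write DATEADD(m, -6, t.Date_Field) in place of SQLite's date(..., '-6 months')):

```python
import sqlite3

# In-memory stand-in for the SALES table (column names are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALES (Units INTEGER, Date_Field TEXT,
                    Master_ID INTEGER, Ave_Monthly_Sales REAL);
INSERT INTO SALES (Units, Date_Field, Master_ID) VALUES
 (10,'2016-01-15',1),(20,'2016-03-15',1),
 (30,'2016-06-15',1),(40,'2015-06-15',1);
""")

# Correlated subquery: for each row, average the Units of the same
# Master_ID over the 6 months up to and including that row's date.
conn.execute("""
UPDATE SALES SET Ave_Monthly_Sales =
  (SELECT AVG(s2.Units) FROM SALES s2
   WHERE s2.Master_ID = SALES.Master_ID
     AND s2.Date_Field BETWEEN date(SALES.Date_Field, '-6 months')
                           AND SALES.Date_Field)
""")

for row in conn.execute(
        "SELECT Date_Field, Units, Ave_Monthly_Sales "
        "FROM SALES ORDER BY Date_Field"):
    print(row)
```

On 8m rows a correlated subquery like this will want an index on (Master ID, Date) to perform reasonably.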
Mihai

I'm assuming that you're going to want this to be dynamic rather than fixed values (to allow for data to be modified and the correct value still shown). The example below shows how I would do this in a SELECT query, without modifying the underlying data:

Test data (# = temporary table)

CREATE TABLE #TestTable (Units int, Date_Field datetime, Master_ID int)
INSERT INTO #TestTable (Units, Date_Field, Master_ID)
VALUES
 (10,'2016-07-06',1)
,(20,'2016-08-06',1)
,(30,'2016-09-06',1)
,(40,'2016-10-06',1)
,(50,'2016-11-06',1)
,(60,'2016-12-06',1)
,(70,'2016-10-06',2)
,(80,'2016-11-06',2)
,(90,'2016-12-06',2)

Query

SELECT
     tt.Master_ID
     ,tt.Date_Field
     ,tt.Units
     ,SUM(tt2.Units) Sum_Units
FROM #TestTable tt
LEFT JOIN #TestTable tt2
    ON tt2.Date_Field BETWEEN DATEADD(m,-3,tt.Date_Field) AND tt.Date_Field
    AND tt.Master_ID = tt2.Master_ID
GROUP BY tt.Master_ID, tt.Date_Field, tt.Units
ORDER BY tt.Master_ID, tt.Date_Field

Output

Master_ID   Date_Field                  Units   Sum_Units
1           2016-07-06 00:00:00.000     10      10
1           2016-08-06 00:00:00.000     20      30
1           2016-09-06 00:00:00.000     30      60
1           2016-10-06 00:00:00.000     40      100
1           2016-11-06 00:00:00.000     50      140
1           2016-12-06 00:00:00.000     60      180
2           2016-10-06 00:00:00.000     70      70
2           2016-11-06 00:00:00.000     80      150
2           2016-12-06 00:00:00.000     90      240

The logic here is that you join back to the same table with the date range set to -3 months (for this example) to give your running total. To get the 6-month average you asked for, change -3 to -6 and swap SUM for AVG.

It's going to be easier to do this than to try to persist the data in a computed column (which would be my other approach), as you only have to run this once when you pull the data. A computed column would have to recalculate every time you put any data into this table.
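To show the SUM-to-AVG swap end to end, here is a runnable sketch of the same self-join over the test data above. It uses SQLite in memory purely so the example runs without a SQL Server instance (in T-SQL you would keep DATEADD(m, -3, tt.Date_Field) in place of SQLite's date(..., '-3 months')):

```python
import sqlite3

# Same test data as the #TestTable example above, in an in-memory DB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TestTable (Units INTEGER, Date_Field TEXT, Master_ID INTEGER);
INSERT INTO TestTable VALUES
 (10,'2016-07-06',1),(20,'2016-08-06',1),(30,'2016-09-06',1),
 (40,'2016-10-06',1),(50,'2016-11-06',1),(60,'2016-12-06',1),
 (70,'2016-10-06',2),(80,'2016-11-06',2),(90,'2016-12-06',2);
""")

# Same self-join as above, but AVG gives the trailing 3-month average
# instead of a running sum.
query = """
SELECT tt.Master_ID, tt.Date_Field, AVG(tt2.Units) AS Avg_Units
FROM TestTable tt
JOIN TestTable tt2
  ON tt2.Master_ID = tt.Master_ID
 AND tt2.Date_Field BETWEEN date(tt.Date_Field, '-3 months')
                        AND tt.Date_Field
GROUP BY tt.Master_ID, tt.Date_Field
ORDER BY tt.Master_ID, tt.Date_Field
"""
for row in conn.execute(query):
    print(row)
```

Each output row is the average of that Master_ID's Units over the 3 months up to and including its date, e.g. Master_ID 1 on 2016-09-06 averages 10, 20 and 30.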

Rich Benner
  • Hello Rich, Thank you for your detailed response, I am going to play around with what you have done and I will get back to you. – Mark Blackburn Dec 06 '16 at 15:06