We just have few built-in functions in SQL Server to do sophisticated statistical analysis but I need to calculate multiple linear regression in TSQL.
Based on this post (Multiple Linear Regression function in SQL Server), I could be able to get Coefficients
for Intercept (Y)
, X1
and X2
.
What I need is p-value
for X1
and X2
and also R Square
Test data:
DECLARE @TestData TABLE (i INT IDENTITY(1, 1), X1 FLOAT, X2 FLOAT, y FLOAT)
INSERT @TestData
SELECT 0, 17, 210872.3034 UNION ALL
SELECT 0, 23, 191988.2299 UNION ALL
SELECT 0, 18, 204564.9455 UNION ALL
SELECT 0, 4, 189528.9212 UNION ALL
SELECT 0, 0, 200203.6364 UNION ALL
SELECT 11, 0, 218814.1701 UNION ALL
SELECT 5, 0, 220109.2129 UNION ALL
SELECT 2, 0, 214377.8534 UNION ALL
SELECT 1, 0, 204926.9208 UNION ALL
SELECT 0, 0, 202499.4065 UNION ALL
SELECT 0, 3, 196917.8182 UNION ALL
SELECT 0, 9, 202286.0012
Desired output:
R Square 0.4991599183412360
p-value X1 0.0264247876580807
p-value X2 0.7817597643898020
I have already been able to get following data from the above test data.
b Coefficients
----------------------------------
Intercept (Y) 202119.231151577
X1 C(H) 1992.8421941724
X2 C(C) -83.8561622730127
I know TSQL is not a good platform to obtain this but I need it to be done purely in TSQL.
I am aware of XLeratorDB Function Packages for SQL Server