0

I want to interpolate missing values and update my table accordingly in SQL server 2012

For example my data is as follow:

Week_Number  Var1   Output_Var
1            10         10
2            20         20
3           NULL        22.5
4           NULL        25.0
5           NULL        27.5
7            30         30

The output of var1 should look like Output_Var variable.

  • Checkout: http://stackoverflow.com/questions/6569376/best-way-to-interpolate-values-in-sql?lq=1 – Hans Apr 15 '13 at 12:29

3 Answers3

4
declare @alo as table(x int, y float);
insert into @alo (x,y) values (1,10);
insert into @alo (x,y) values (2,20);
insert into @alo (x,y) values (3,null);
insert into @alo (x,y) values (4,null);
insert into @alo (x,y) values (5,null);
insert into @alo (x,y) values (6,30);

SELECT this.x as [X], isnull(this.y,
    (
        SELECT    CASE WHEN next.x IS NULL  THEN prev.y
                       WHEN prev.x IS NULL  THEN next.y
                       WHEN next.x = prev.x THEN prev.y
                       ELSE prev.y + ( (next.y - prev.y) * (this.x - prev.x) / (next.x - prev.x) )
                  END 
        FROM
            ( SELECT TOP 1 X, Y FROM @alo WHERE x <= this.x and y is not null ORDER BY x DESC ) AS prev
            CROSS JOIN
            ( SELECT TOP 1 X, Y FROM @alo WHERE x >= this.x and y is not null ORDER BY x ASC ) AS next
        )) as [Y]
FROM @alo this order by this.x ASC
Jon
  • 53
  • 5
0

You can estimate you missing values using the simple linear regression technique, see the "numerical example".

polkduran
  • 2,533
  • 24
  • 34
0

You could use something like that (from this):

declare @alo as table(x int, y float);
insert into @alo (x,y) values
(1,10),
(2,20),
(3,null),
(4,null),
(5,null),
(6,30)
;
declare @sumtable as table(sx int ,sy int ,sx2 int,sy2 int ,sxy int, n int );

insert into @sumtable
select 
SUM(d.x) as sx,
SUM(d.y) as sy,
SUM(d.x2) as sx2,
SUM(d.y2) as sy2,
SUM(d.xy) as sxy,
count(0) as n
from (

    select
    x, x*x as x2,
    y, y*y as y2,
    x*y as xy
    from @alo
    where x is not null and y is not null
)  D



declare @sx int = (select sx from @sumtable), 
        @sx2 int = (select sx2 from @sumtable),
        @sy int= (select sy from @sumtable), 
        @sy2 int= (select sy2 from @sumtable),
        @sxy int= (select sxy from @sumtable),
        @n int =  (select n from @sumtable);


declare @b as float = cast((@n*@sxy- @sx*@sy) as float)/ cast((@n*@sx2 - @sx*@sx) as float);
declare @a as float = (1.0/@n)*@sy - @b*(1.0/@n)*@sx;


update @alo 
set y = @b*x+@a
where y is null

select * from @alo
polkduran
  • 2,533
  • 24
  • 34
  • Thank you for the answer. But its seems to be incorrect as Null are replace by 38,50 and 62, but actually it should be replace by 22.5, 25 and 27.5 . – user2239073 Apr 16 '13 at 05:26
  • There were indeed two mistakes, one on the data, and the other on the linear function coefs. I inverted them, must read y = @b*x+@a, I edited the answer – polkduran Apr 16 '13 at 09:04
  • I also set @alo.y as float to have more accurate values, the result is : (1,10),(2,20),(3,19.99),(4,23.57),(5,27.14),(6,30) you will not have what you expected because this is an estimation method. If you remove the first value on your data (1,10), now you will have what you expected because you will interpolate between two values and not a linear regression estimation – polkduran Apr 16 '13 at 09:09