Building on Jason S's solution to find a weighted-average and reign back in S
's growth.
Using the M
finding algorithm given before along with the aggregate formulas for weighted average and population standard deviation:
Avg = Avg(W*X) / Avg(W)
StDev = sqrt(Avg(W*X*X) / Avg(W) - Avg*Avg)
rewrite the code to find the three running averages, then do then the aggregate calculations at the end
function GetPopulationStats{
<#
.SYNOPSIS
Calculate the average, variance, and standard deviation of a weighted data set
.DESCRIPTION
Uses the Knuth method for finding means adapted by Jason S, to calculate the
three averages required by the agregate statistical formulas
.LINK
https://stackoverflow.com/a/1346890/4496560
#>
param(
[decimal[]]$x #Data Points
,[decimal[]]$w #Weights
)
$N = $x.Length
[decimal]$AvgW = 0
[decimal]$AvgWX = 0
[decimal]$AvgWXX = 0
for($i=0; $i -lt $N; $i++){
$AvgW += ($w[$i] - $AvgW) / ($i+1)
$AvgWX += ($w[$i]*$x[$i] - $AvgWX) / ($i+1)
$AvgWXX += ($w[$i]*$x[$i]*$x[$i] - $AvgWXX) / ($i+1)
}
[ordered]@{
N = $N
Avg = ($avg = $AvgWX / $AvgW)
Var = ($var = ($AvgWXX / $AvgW) - ($Avg * $Avg))
StDev = SquareRoot $var
}
}
and then if your language is like Windows PowerShell you'll likely need a higher precision [math]::sqrt()
function
function SquareRoot([decimal]$a){
<#
.SYNOPSIS
Find the square-root of $a
.DESCRIPTION
Uses the McDougall-Wotherspoon variant of the Newton-Raphson method to
find the positive zero of:
f(x) = (x * x) - a
f'(x) = 2 * x
.NOTES
ToDo: using a fitted polynomial would likely run faster
#>
$BiCycleX = $PrevX = 0;
$x = $a/2 # guess
$y = ($x * $x) - $a
$xx = $x
$m = $x + $xx
$del = $x - $PrevX
if($del -lt 0){ $del = -$del }
$i = 0
while($del -gt 0 -and $x -ne $BiCycleX){
$BiCycleX = $PrevX;
$PrevX = $x;
$x = $x - ($y / $m)
$y = ($x * $x) - $a
$xx = $x - ($y / $m)
$m = $x + $xx
$del = $x - $PrevX
if($del -lt 0){ $del = -$del }
if(++$i -ge 50){
throw ("invariant sanity fail on step {0}:`r`n x_(n-1) = {1}`r`n x_n = {2}`r`n delta = {3:0.#e0}" -f $i,$PrevX,$x,$del)
}
}
($x + $PrevX) / 2
}
however, if you don't need a weighted solution it should be easy enough to just let w[i]=1
for all i
finally, it doesn't hurt to do a quick sanity check on the code
describe 'tool spot-check' {
context 'testing root calcs' {
$TestCases = @(
@{Value = 0; Expected = 0}
@{Value = 1; Expected = 1}
@{Value = 4; Expected = 2}
@{Value = 9; Expected = 3}
@{Value = 2; Expected = [decimal]'1.4142135623730950488016887242'}
@{Value = (1e14-1); Expected = [decimal]'9999999.99999995'}
)
It 'finds the square root of: <Value>' -TestCases $TestCases {
param($Value,$Expected)
SquareRoot $Value | should be $Expected
}
}
context 'testing stat calcs' {
It 'calculates the values for 1 to 1000' {
$x = 1..1000
$w = @(1) * 1000
$results = GetPopulationStats $x $w
$results.N | should be 1000
$results.Avg | should be 500.5
$results.Var | should be 83333.25
$results.StDev | should be ([decimal]'288.67499025720950043826670416')
}
It 'calculates the values for a test data set' {
$x = @(33,119,37,90,50,94,32,147,86,28,50,80,145,131,121,90,140,170,214,70,124)
$w = @(207,139,25,144,72,162,93,91,109,151,125,87,49,99,210,105,99,169,50,59,22)
$results = GetPopulationStats $x $w
$results.N | should be 21
$results.Avg | should be ([decimal]'94.54433171592412880458756066')
$results.Var | should be ([decimal]'2202.659150711314347179152603')
$results.StDev | should be ([decimal]'46.93249567955356821948311637')
}
}
}