Abstract
There are times in life when your signals are noisy and you want to go from this
to this
There are many existing approaches that allow you to smooth data: local linear and polynomial regressions, different kinds of moving averages:
https://en.wikipedia.org/wiki/Kernel_smoother
https://homepages.inf.ed.ac.uk/rbf/HIPR2/gsmooth.htm
https://www.stat.wisc.edu/~mchung/softwares/hk/hk.html
https://matthew-brett.github.io/teaching/smoothing_intro.html
How to calculate a Gaussian kernel matrix efficiently in numpy?
How to smooth the blocks of a 3D voxel world?
However I've found that for my case the best suited approach is gaussian kernel smoothing. It worked like a charm for me on a desktop but when switched to mobile device the algorithm stalled the device due to high computational requirements.
One of the reasons to that was the fact that prior to smoothing I had to equalize the sampling rate with other signal, which heavily increased the sampling rate and thus the amount of data required for processing.
This also yielded to the fact that the size of the gaussian kernel had to be tens of thousands of components which directly influenced the complexity of the algorithm for my naive approach it requires tens of thousands of elements in the filter to be multiplied and summed with tens of thousands of components from signal to receive just one component of smoothed data.
There is way to do gaussian kernel smoothing efficiently that requires the use of Fourier Transform which is very complex to implement efficiently.
So instead of going along this path I've thought that you could actually do smoothing a lot more efficiently without diving into complexities of Fourier Transforms.
Basically any signal is just a collection of pairs [time, value]
ordered by time. When interpolated using linear interpolation it can always be represented with Polygonal chain and you can actually smooth one component of Polygonal chain (i.e. Line Segment) analytically. I won't dive deeper into how you actually compute the convolutional integral of analytically defined kernel over analytically defined line segment. It's just worth saying here that the result is a huge piecewise function:
That yields to a number of if
statements:
public static Double Solution(Double x, Double w, Double x1, Double y1, Double x2, Double y2)
{
Double result;
if (x + w < x2 && w + x1 <= x)
{
result = (x * y1 - x2 * y1 - x * y2 + x1 * y2) / (x1 - x2);
}
else if (x1 < x + w && x < x1 && x + w < x2)
{
result = Math.Pow(w + x - x1, 2) * ((w + x + 2 * x1 - (3 * x2)) * y1 - (w + x - x1) * y2) * Math.Pow(w, -2) / (x1 - x2) / 6;
}
else if (x == x1 && x + w < x2)
{
result = y1 / 2 + w * (y1 - y2) / (x1 - x2) / 6;
}
else if (x + w < x2 && x1 < x && x < w + x1)
{
result = Math.Pow(w, -2) / (x1 - x2) * (Math.Pow(w, 3) * (y1 - y2) + 3 * w * w * (-x2 * y1 + x * (y1 - y2) + x1 * y2) - Math.Pow(x - x1, 2) * ((x + 2 * x1 - (3 * x2)) * y1 + (-x + x1) * y2) + 3 * w * (x - x1) * ((x + x1 - (2 * x2)) * y1 + (-x + x1) * y2)) / 6;
}
else if (x2 <= x + w && x < x1)
{
result = (x1 - x2) * (2 * y1 * x1 + x2 * y1 + x1 * y2 + 2 * x2 * y2 - 3 * w * (y1 + y2) - 3 * x * (y1 + y2)) * Math.Pow(w, -2) / 6;
}
else if (x2 <= x && w + x1 <= x && x < w + x2)
{
result = -Math.Pow(w - x + x2, 2) * ((w - x + x2) * y1 + (-w + x - 3 * x1 + (2 * x2)) * y2) * Math.Pow(w, -2) / (x1 - x2) / 6;
}
else if (x2 <= x && x < w + x1)
{
result = -(x1 - x2) * (2 * y1 * x1 + x2 * y1 + x1 * y2 + 2 * x2 * y2 + 3 * w * (y1 + y2) - 3 * x * (y1 + y2)) * Math.Pow(w, -2) / 6;
}
else if (x == x1 && (x + w == x2 && (x2 <= x && x + w < x1 && w + x2 <= x || x < w + x1) || x < w + x1 && x2 <= x + w && x < x2))
{
result = (-x1 + x2) * (3 * w * (y1 + y2) + (x1 - x2) * (y1 + 2 * y2)) * Math.Pow(w, -2) / 6;
}
else if (x < x2 && x2 <= x + w && w + x1 <= x)
{
result = Math.Pow(w, -2) / (x1 - x2) * (Math.Pow(w, 3) * (-y1 + y2) + 3 * w * w * (-x2 * y1 + x * (y1 - y2) + x1 * y2) + Math.Pow(x - x2, 2) * ((-x + x2) * y1 + (x - 3 * x1 + (2 * x2)) * y2) + 3 * w * (x - x2) * (-2 * x1 * y2 + x * (-y1 + y2) + x2 * (y1 + y2))) / 6;
}
else if (x < x2 && x2 <= x + w && x1 < x && x < w + x1)
{
result = Math.Pow(w, -2) / (x1 - x2) * (-2 * Math.Pow(x1, 3) * y1 + 3 * x1 * x1 * x2 * y1 + Math.Pow(x2, 3) * y1 - Math.Pow(x1, 3) * y2 - 3 * x1 * (x2 * x2) * y2 + 2 * Math.Pow(x2, 3) * y2 + 2 * Math.Pow(x, 3) * (-y1 + y2) - 3 * w * Math.Pow(x1 - x2, 2) * (y1 + y2) + 6 * x * x * (x2 * y1 - x1 * y2) + 3 * x * (2 * x1 * x2 * (-y1 + y2) + x1 * x1 * (y1 + y2) - (x2 * x2) * (y1 + y2))) / 6;
}
else
{
result = 0.0e0;
}
return result;
}
And here's how it visually looks like:
You can also combine multiple line segments via sum of smoothing functions for these segments
Thus reducing the computational complexity down to number of points that you have:
The problem
If statement above looks inefficient and there are numerical stability problems with it. When line segments defined by x1,y1,x2,y2
are getting big numbers the result explodes. I'm trying deal with these problems but it's harder than I expected. Even careful expansion of equations introduces some strange errors in to computation. So I thought it was a good idea to consult with StackOverflow to probably get more ideas on how to optimize this statement and make it more numerically stable.
Thank you in advance,