3

Abstract

There are times in life when your signals are noisy and you want to go from this

enter image description here

to this

enter image description here

There are many existing approaches that allow you to smooth data: local linear and polynomial regressions, different kinds of moving averages:

https://en.wikipedia.org/wiki/Kernel_smoother

https://homepages.inf.ed.ac.uk/rbf/HIPR2/gsmooth.htm

https://www.stat.wisc.edu/~mchung/softwares/hk/hk.html

https://matthew-brett.github.io/teaching/smoothing_intro.html

How to calculate a Gaussian kernel matrix efficiently in numpy?

How to smooth the blocks of a 3D voxel world?

However I've found that for my case the best suited approach is gaussian kernel smoothing. It worked like a charm for me on a desktop but when switched to mobile device the algorithm stalled the device due to high computational requirements.

One of the reasons to that was the fact that prior to smoothing I had to equalize the sampling rate with other signal, which heavily increased the sampling rate and thus the amount of data required for processing.

This also yielded to the fact that the size of the gaussian kernel had to be tens of thousands of components which directly influenced the complexity of the algorithm for my naive approach it requires tens of thousands of elements in the filter to be multiplied and summed with tens of thousands of components from signal to receive just one component of smoothed data.

There is way to do gaussian kernel smoothing efficiently that requires the use of Fourier Transform which is very complex to implement efficiently.

So instead of going along this path I've thought that you could actually do smoothing a lot more efficiently without diving into complexities of Fourier Transforms.

Basically any signal is just a collection of pairs [time, value] ordered by time. When interpolated using linear interpolation it can always be represented with Polygonal chain and you can actually smooth one component of Polygonal chain (i.e. Line Segment) analytically. I won't dive deeper into how you actually compute the convolutional integral of analytically defined kernel over analytically defined line segment. It's just worth saying here that the result is a huge piecewise function:

enter image description here

That yields to a number of if statements:

public static Double Solution(Double x, Double w, Double x1, Double y1, Double x2, Double y2)
{
    Double result;

    if (x + w < x2 && w + x1 <= x)
    {
        result = (x * y1 - x2 * y1 - x * y2 + x1 * y2) / (x1 - x2);
    }
    else if (x1 < x + w && x < x1 && x + w < x2)
    {
        result = Math.Pow(w + x - x1, 2) * ((w + x + 2 * x1 - (3 * x2)) * y1 - (w + x - x1) * y2) * Math.Pow(w, -2) / (x1 - x2) / 6;
    }
    else if (x == x1 && x + w < x2)
    {
        result = y1 / 2 + w * (y1 - y2) / (x1 - x2) / 6;
    }
    else if (x + w < x2 && x1 < x && x < w + x1)
    {
        result = Math.Pow(w, -2) / (x1 - x2) * (Math.Pow(w, 3) * (y1 - y2) + 3 * w * w * (-x2 * y1 + x * (y1 - y2) + x1 * y2) - Math.Pow(x - x1, 2) * ((x + 2 * x1 - (3 * x2)) * y1 + (-x + x1) * y2) + 3 * w * (x - x1) * ((x + x1 - (2 * x2)) * y1 + (-x + x1) * y2)) / 6;
    }
    else if (x2 <= x + w && x < x1)
    {
        result = (x1 - x2) * (2 * y1 * x1 + x2 * y1 + x1 * y2 + 2 * x2 * y2 - 3 * w * (y1 + y2) - 3 * x * (y1 + y2)) * Math.Pow(w, -2) / 6;
    }
    else if (x2 <= x && w + x1 <= x && x < w + x2)
    {
        result = -Math.Pow(w - x + x2, 2) * ((w - x + x2) * y1 + (-w + x - 3 * x1 + (2 * x2)) * y2) * Math.Pow(w, -2) / (x1 - x2) / 6;
    }
    else if (x2 <= x && x < w + x1)
    {
        result = -(x1 - x2) * (2 * y1 * x1 + x2 * y1 + x1 * y2 + 2 * x2 * y2 + 3 * w * (y1 + y2) - 3 * x * (y1 + y2)) * Math.Pow(w, -2) / 6;
    }
    else if (x == x1 && (x + w == x2 && (x2 <= x && x + w < x1 && w + x2 <= x || x < w + x1) || x < w + x1 && x2 <= x + w && x < x2))
    {
        result = (-x1 + x2) * (3 * w * (y1 + y2) + (x1 - x2) * (y1 + 2 * y2)) * Math.Pow(w, -2) / 6;
    }
    else if (x < x2 && x2 <= x + w && w + x1 <= x)
    {
        result = Math.Pow(w, -2) / (x1 - x2) * (Math.Pow(w, 3) * (-y1 + y2) + 3 * w * w * (-x2 * y1 + x * (y1 - y2) + x1 * y2) + Math.Pow(x - x2, 2) * ((-x + x2) * y1 + (x - 3 * x1 + (2 * x2)) * y2) + 3 * w * (x - x2) * (-2 * x1 * y2 + x * (-y1 + y2) + x2 * (y1 + y2))) / 6;
    }
    else if (x < x2 && x2 <= x + w && x1 < x && x < w + x1)
    {
        result = Math.Pow(w, -2) / (x1 - x2) * (-2 * Math.Pow(x1, 3) * y1 + 3 * x1 * x1 * x2 * y1 + Math.Pow(x2, 3) * y1 - Math.Pow(x1, 3) * y2 - 3 * x1 * (x2 * x2) * y2 + 2 * Math.Pow(x2, 3) * y2 + 2 * Math.Pow(x, 3) * (-y1 + y2) - 3 * w * Math.Pow(x1 - x2, 2) * (y1 + y2) + 6 * x * x * (x2 * y1 - x1 * y2) + 3 * x * (2 * x1 * x2 * (-y1 + y2) + x1 * x1 * (y1 + y2) - (x2 * x2) * (y1 + y2))) / 6;
    }
    else
    {
        result = 0.0e0;
    }

    return result;
}

And here's how it visually looks like: enter image description here

You can also combine multiple line segments via sum of smoothing functions for these segments

enter image description here

Thus reducing the computational complexity down to number of points that you have:

The problem

If statement above looks inefficient and there are numerical stability problems with it. When line segments defined by x1,y1,x2,y2 are getting big numbers the result explodes. I'm trying deal with these problems but it's harder than I expected. Even careful expansion of equations introduces some strange errors in to computation. So I thought it was a good idea to consult with StackOverflow to probably get more ideas on how to optimize this statement and make it more numerically stable.

Thank you in advance,

Community
  • 1
  • 1
Lu4
  • 14,873
  • 15
  • 79
  • 132
  • 4
    Does it need to be optimized? Sure, it's ugly to read, but I'd just add a comment saying something like "this was autogenerated with , don't touch" – AKX Feb 01 '19 at 10:56
  • 1
    What does the calculation represent? Why do you feel the need to simplify it? Would you be better to just add comments to it (and meaningful parameter names)? – mjwills Feb 01 '19 at 10:56
  • Almost certainly, you should simplify it using mathematical simplification of the current equation rather than trying to simplify the resulting code. – Damien_The_Unbeliever Feb 01 '19 at 10:57
  • Guys please see updates – Lu4 Feb 01 '19 at 11:03
  • 1
    Did you wrote tests for it? – Fabio Feb 01 '19 at 11:09
  • @Fabio this would be a question to a question ))) But no I haven't written tests for it, and more to that is that I don't have to do this because this is analytical solution generated through special mathematical software, it doesn't introduce errors... But there are edge case things that actually are ugly and wrong such as comparison of Double's – Lu4 Feb 01 '19 at 11:18
  • You could maybe change it to a switch/case statement, its not much cleaner but I personally find them easier to read – MikeS Feb 01 '19 at 12:17
  • @Lu4, my point was, with tests you can play around with different approaches and in seconds will get a feedback is your approach correct. – Fabio Feb 01 '19 at 19:10
  • Oh, now I get what you've meant, I need tests while I will be playing with this expression, good point thanks! – Lu4 Feb 01 '19 at 19:29
  • Can't you use the Kernel density estimation algorithms implemented by Math.NET Numerics ? https://numerics.mathdotnet.com/api/MathNet.Numerics.Statistics/KernelDensity.htm#GaussianKernel – Gimly Feb 15 '19 at 12:21
  • One option would be to extract each function into its own class with an IsApplicable method to check for the condition and a calculate method for the result. Put the extracted classes into a sorted list and iterate over them to get the results from the applicable functions. This way you could also test each function separately if desired. – Lennart Feb 15 '19 at 14:06
  • 1
    @Gimly I'm using triangle kernel density under the hood, I've tried gaussian, rectangular, cosine, sigmoid, and many custom variants but not all of them are usable for this task. It should be noted that the kernel density function alone won't help you. It won't smooth and interpolate your signal, but this function does. You just plug your segments into it and get smooth output at any point you wish. I just need to make it more robust and numerically stable... – Lu4 Feb 15 '19 at 14:08
  • Why not write a service and host it? Your mobile devices could request for the "smooth" data instead of doing all the heavy work itself. – j4rey Feb 15 '19 at 14:16
  • @Lennart good point, I'm using something similar to this in order to identify numerical stability problems right now – Lu4 Feb 15 '19 at 14:17
  • @j4rey devices can go offline at any point – Lu4 Feb 15 '19 at 14:17

0 Answers0