0

I thought of an algorithm for integral calculation that should be more accurate that the regular rectangle approach. My algorithm can be best described with a graphic (I am using f(x) = sin(x) as example): algorithm

  1. First the x and y positions of the points P1, P2, P3, P4 are calculated (red dots).

  2. The area of the green four sided figure is a part of the result. This area is calculated by dividing it into two triangles (blue line).

  3. The area of each triangle is calculated using the Heron’s formula.

I think it is obvious that this should lead to much better results than the rectangle approach.

In code this looks like this:

double integral(double f(double x), double min, double max) {
    Point p1, p2, p3, p4;
    double area = 0.0;
    double s1 = 0.0;
    double s2 = 0.0;


    for(double x = min; x < max; x += stepSize) {
        p1.x = x;
        p1.y = 0.0;

        p2.x = x;
        p2.y = f(x);

        p3.x = x + stepSize;
        p3.y = f(x + stepSize);

        p4.x = x + stepSize;
        p4.y = 0.0;


        s1 = 0.5 * (distance(p1, p2) + distance(p2, p4) + distance(p1, p4));
        s2 = 0.5 * (distance(p2, p3) + distance(p3, p4) + distance(p2, p4));

        area += sqrt(s1 * (s1 - distance(p1, p2)) * (s1 - distance(p2, p4)) * (s1 - distance(p1, p4)));
        area += sqrt(s2 * (s2 - distance(p2, p3)) * (s2 - distance(p3, p4)) * (s2 - distance(p2, p4)));
    }

    return area;
}

The distance function is just returning the distance between two points in 2D space. The Point struct is just holding a x and y coordinate. stepSize is a constant that I set to 0.001

My function is giving a result that is correct, but I wanted to know how much more precise it is compared to the rectangle approach.

On the internet I found this code that is calculating a integral using rectangles:

double integral2(double(*f)(double x), double a, double b, int n) {
    double step = (b - a) / n;  // width of each small rectangle
    double area = 0.0;  // signed area
    for (int i = 0; i < n; i ++) {
        area += f(a + (i + 0.5) * step) * step; // sum up each small rectangle
    }
    return area;
}

I both tested them using the standard math.h sin function from 0.0 to half π. This area should be 1.

My algorithm has given me the result 1.000204 for a step-size of 0.001.

The rectangle algorithm hast given me the result 1.000010 with a calculated step-size of 0.015708.

How can such a difference in accuracy and step-size be explained? Did I implement my algorithm wrong?

Update

Using the calculated step-size of the second method, I get the result 0.999983 which is much closer to one than the result with a step-size of 0.001.

Now how can that work??

user11914177
  • 885
  • 11
  • 33
  • Standard floating point rounding issues? See [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) for details – Some programmer dude Oct 28 '19 at 11:31
  • @Someprogrammerdude I made sure all functions are using doubles only – user11914177 Oct 28 '19 at 11:32
  • `double` is still a floating point type, and as such will have rounding errors even if they will be smaller than for `float`. The more operations you do with your floating point values, the bigger the error will get (the rounding compounds). – Some programmer dude Oct 28 '19 at 11:34
  • 1
    I would have computed the area of the trapezoids without dividing them into triangles. This will reduce the number of floating point operations and with them (hopfully) the rounding error. – Johan Oct 28 '19 at 11:35
  • @Someprogrammerdude yes obviously, but the other function is calculating with much more accuracy with a smaller step-size. – user11914177 Oct 28 '19 at 11:36
  • @Johan that's a good idea, but I think this shouldn't explain this inaccuracy – user11914177 Oct 28 '19 at 11:38
  • 1
    you are using a first order method compared to zero oder rectangles. The main difference is that rectagles approximate the function with piecewise constant functions (0-th order) while you approximate the function with piecewise linear functions (1st odrder). Making the dx small enough they will both converge to the same result, but yours will converge faster – 463035818_is_not_an_ai Oct 28 '19 at 11:38
  • note that this isnt a spectacular discovery (I dont want to belittle your approach, it is a nice one!), but it is a well known fact that you can take this further by approximating the orgininal function with polynomials to get even faster convergence – 463035818_is_not_an_ai Oct 28 '19 at 11:39
  • @formerlyknownas_463035818 could you please explain that in an easier language? – user11914177 Oct 28 '19 at 11:40
  • 1
    i added this question to my favourites and hope i can find time to write an easy to understand answer, just dont have the time now – 463035818_is_not_an_ai Oct 28 '19 at 11:41

2 Answers2

3

Your last trapezoid may be too wide: x+stepSize may be above max if max-min isn't a multiple of stepSize. That's why in the rectangular summation code you included, rather than stepSize, they use n (the number of rectangles).

You compute the trapezoid in a complicated way. Note that its area is stepSize * (P2.y + P3.y)/2. This adds computation cost, but I guess is not the cause of the numerical error in your test integral.

Except for these issues, your method is otherwise equivalent to the trapezoid rule. https://en.wikipedia.org/wiki/Trapezoidal_rule

Here is Python code that approximates the integral in three different ways, using 100 rectangles. The three ways are trap_heron (your method, using Heron's rule), trap (trapezoid method), and rect (rectangular summation). Your question is C++, but the results should be the same.

import math

N = 100

def dist(a, b):
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    return math.sqrt(dx*dx + dy*dy)

def trap_heron(f, min, max):
    area = 0.0
    for i in range(N):
        x0 = min + (max-min) * i/N
        x1 = min + (max-min) * (i+1)/N
        y0 = f(x0)
        y1 = f(x1)

        p1 = (x0, 0.0)
        p2 = (x0, y0)
        p3 = (x1, y1)
        p4 = (x1, 0.0)

        s1 = 0.5 * (dist(p1, p2) + dist(p2, p4) + dist(p1, p4))
        s2 = 0.5 * (dist(p2, p3) + dist(p3, p4) + dist(p2, p4))

        area += math.sqrt(s1 * (s1 - dist(p1, p2)) * (s1 - dist(p2, p4)) * (s1 - dist(p1, p4)))
        area += math.sqrt(s2 * (s2 - dist(p2, p3)) * (s2 - dist(p3, p4)) * (s2 - dist(p2, p4)))
    return area

def trap(f, min, max):
    area = 0.0
    for i in range(N):
        x0 = min + (max-min) * i/N
        x1 = min + (max-min) * (i+1)/N
        y0 = f(x0)
        y1 = f(x1)
        area += (x1-x0) * (y0+y1)/2

    return area


def rect(f, min, max):
    area = 0.0
    for i in range(N):
        y = f(min + (max-min)*(i+0.5)/N)
        area += (max-min)/N * y
    return area

print(trap(math.sin, 0, math.pi/2))
print(trap_heron(math.sin, 0, math.pi/2))
print(rect(math.sin, 0, math.pi/2))

The output is:

0.9999794382396076
0.9999794382396054
1.0000102809119051

Note that trap and trap_heron produce very nearly the same result.

In your comments, you have a result of 1.015686. The error is very close to stepSize * sin(pi/2), so I guess you've summed up one too many trapezoids.

Paul Hankin
  • 54,811
  • 11
  • 92
  • 118
  • Interesting idea, I now am setting step-size to be `(max - min) / rectangles`. But with 100 rectangles in both methods my new results are `1.015686` with my method and `1.000010` with the rectangles. How does this work? – user11914177 Oct 28 '19 at 12:24
  • Thanks, that one extra trapezoid was the problem – user11914177 Oct 28 '19 at 12:56
0

You can try Kahan summation to reduce the error but the precision issue is real. You are approximating the integral using a numerical method after all.

John Leidegren
  • 59,920
  • 20
  • 131
  • 152
  • What does numerical method mean? I haven't learned about integrals in school yet. – user11914177 Oct 28 '19 at 11:32
  • 1
    @user11914177 Simplified, simply see "numerical methods" as opposed to "analytically absolute correct methods". Essentially, it is a field of math that deals with approximating things rather than calculating them (while being able to quantify how close one is to the actual solution). – Aziuth Oct 28 '19 at 13:01