25

How is the derivative of a function f(x) typically calculated programmatically to ensure maximum accuracy?

I am implementing the Newton-Raphson method, and it requires taking the derivative of a function.

Peter Mortensen
vehomzzz
  • Show us what you've done so far. – Lazarus Oct 13 '09 at 11:42
  • 11
    do you want me to get fired :)? – vehomzzz Oct 13 '09 at 11:48
  • 5
    Newton's method accuracy doesn't depend (only) on the accuracy of the derivative, a gross approximation will do too (in the extreme case you get the secant method), but it will be a little bit slower to get the same precision (more iterations needed). – fortran Oct 13 '09 at 12:44

8 Answers

65

I agree with @erikkallen that (f(x + h) - f(x - h)) / (2 * h) is the usual approach for numerically approximating derivatives. However, getting the right step size h is a little subtle.

The approximation error in (f(x + h) - f(x - h)) / (2 * h) decreases as h gets smaller, which says you should take h as small as possible. But as h gets smaller, the error from floating point subtraction increases, since the numerator requires subtracting nearly equal numbers. If h is too small, you can lose a lot of precision in the subtraction. So in practice you have to pick a not-too-small value of h that minimizes the combination of approximation error and numerical error.

As a rule of thumb, you can try h = SQRT(DBL_EPSILON), where DBL_EPSILON is the smallest double precision number e such that 1 + e != 1 in machine precision. DBL_EPSILON is about 2.2 × 10^-16, so you could use h = 10^-7 or 10^-8.

For more details, see these notes on picking the step size for differential equations.
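The trade-off can be seen with a small Python sketch. It is only an illustration under stated assumptions: `sin` at x = 1 is a hypothetical test case, `central_difference` is a name chosen here, and `sys.float_info.epsilon` stands in for C's DBL_EPSILON. The list of steps also includes the cube root of epsilon, the scaling usually quoted for second-order rules.

```python
import math
import sys

def central_difference(f, x, h):
    """Approximate f'(x) with (f(x + h) - f(x - h)) / (2 * h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

eps = sys.float_info.epsilon   # Python's counterpart of C's DBL_EPSILON
x = 1.0
exact = math.cos(x)            # exact derivative of sin at x

# Large h: truncation error dominates. Tiny h: cancellation dominates.
for h in (1e-1, 1e-3, eps ** (1.0 / 3.0), math.sqrt(eps), 1e-12):
    approx = central_difference(math.sin, x, h)
    print(f"h = {h:.3e}   error = {abs(approx - exact):.3e}")
```

The printed errors should shrink as h decreases from 0.1 and then grow again once h becomes very small, which is the behaviour described above.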

Regexident
John D. Cook
  • 4
    I think your rule of thumb assumes you use a first-order rule to approximate the derivative. However, the central difference rule you mention is second order, and the corresponding rule of thumb is h = EPSILON^(1/3) which is approximately 10^(-5) when using double precision. – Jitse Niesen Oct 13 '09 at 13:05
  • I think the accuracy can be improved a little by dividing by (x+h)-(x-h) instead of 2h. It's mathematically equivalent but not numerically. – sellibitze Oct 13 '09 at 13:10
  • 1
    Did you mean instead "DBL_EPSILON is the smallest double precision number e such that **1 + e != 1** in machine precision." – yves Baumes Oct 13 '09 at 15:58
  • 2
    Choosing h depends on the derivative f'''(x). – Alexey Malistov Oct 13 '09 at 16:10
  • +1 great answer. I would surely have overlooked that very small sizes of h magnify floating point errors. Thanks, I learned something today. – MAK Oct 13 '09 at 19:20
  • As a rule of thumb, take instead 'cubic root of the accuracy to which you know f' times the order of magnitude of f. Your order 2 finite difference has truncation + roundoff error on the order of accuracy(f)^2/3 – Alexandre C. Sep 07 '11 at 12:57
11

Newton-Raphson assumes that you have two functions: f(x) and its derivative f'(x). If you do not have the derivative available as a function and have to estimate it from the original function, then you should use another root-finding algorithm.

The Wikipedia article on root finding gives several suggestions, as would any numerical analysis text.
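One derivative-free alternative is the secant method, which replaces f'(x) with the slope through the two most recent iterates. The following Python sketch is a minimal illustration, not a hardened implementation; the function name `secant`, the tolerances, and the test equation x^2 - 2 = 0 are assumptions made for the example.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Find a root of f with the secant method (no derivative required)."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break  # flat secant line: no further update is possible
        # Root of the straight line through (x0, f0) and (x1, f1).
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) <= tol * max(1.0, abs(x2)):
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1

root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
print(root)
```

Each iteration costs one new function evaluation, at the price of somewhat slower convergence than Newton's method with an exact derivative.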

mmmmmm
10

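f'(x) ≈ (f(x + h) - f(x)) / h

f'(x) ≈ (f(x + h) - f(x - h)) / (2h)

1) First case:

(f(x + h) - f(x)) / h = f'(x) + (h / 2) f''(x) + O(h^2)

ε: relative rounding error, about 2^{-16} for double and 2^{-7} for float.

We can calculate total error (assuming f(x) is of order 1):

(h / 2) |f''(x)| + 2ε / h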

Suppose that you are using double-precision floating-point operations. Then the optimal value of h is 2sqrt(DBL_EPSILON/f''(x)). You do not know f''(x), so you have to estimate it. For example, if f''(x) is about 1, then the optimal value of h is 2^{-7}, but if f''(x) is about 10^6, then the optimal value of h is about 2^{-17}!

2) Second case:

(f(x + h) - f(x - h)) / (2h) = f'(x) + (h^2 / 6) f'''(x) + O(h^4)

Note that the second approximation error tends to 0 faster than the first one. But if f'''(x) is very large, then the first option is preferable:

total error: (h^2 / 6) |f'''(x)| + ε / h, which is smallest at h = (3ε / |f'''(x)|)^{1/3}

Note that in the first case h is proportional to ε^{1/2}, but in the second case h is proportional to ε^{1/3}. For double-precision operations ε^{1/3} is 2^{-5} or 2^{-6}. (I suppose that f'''(x) is about 1.)


Which way is better? It is unknown if you do not know f''(x) and f'''(x) or cannot estimate these values. It is believed that the second option is preferable. But if you know that f'''(x) is very large, use the first one.

What is the optimal value of h? Suppose that f''(x) and f'''(x) are about 1. Also assume that we use double-precision floating-point operations. Then in the first case h is about 2^{-7}, and in the second case h is about 2^{-5}. Correct these values if you know f''(x) or f'''(x).
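The step sizes can also be computed rather than read off. The Python sketch below is a hedged illustration: `optimal_h_forward` uses the 2sqrt(ε/f''(x)) formula from the text, while `optimal_h_central` assumes a total error model of (h^2/6)|f'''(x)| + ε/h, which is an assumption of this example and not taken from the answer. Both function names are hypothetical. With the epsilon that Python reports for doubles (`sys.float_info.epsilon`, which is 2^-52), the resulting steps are much smaller than the powers of two quoted above.

```python
import sys

def optimal_h_forward(eps, f2):
    """Step minimizing (h/2)|f''| + 2*eps/h for the one-sided difference."""
    return 2.0 * (eps / abs(f2)) ** 0.5

def optimal_h_central(eps, f3):
    """Step minimizing (h^2/6)|f'''| + eps/h (assumed model) for the central difference."""
    return (3.0 * eps / abs(f3)) ** (1.0 / 3.0)

eps = sys.float_info.epsilon  # 2**-52 for IEEE double precision

# Derivative magnitudes below are guesses the caller has to supply.
for magnitude in (1.0, 1e6):
    print(f"|f''| = |f'''| = {magnitude:g}: "
          f"forward h = {optimal_h_forward(eps, magnitude):.3e}, "
          f"central h = {optimal_h_central(eps, magnitude):.3e}")
```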

Spooky
Alexey Malistov
  • epsilon should be 2^-53 for double, and 2^-24 for float (which is about 10^-16 and 10^-7, respectively). – Stephen Canon Oct 13 '09 at 15:50
  • epsilon is **relative** rounding error (not absolute). It is always about 10^{-16} for double and 10^-7 for float – Alexey Malistov Oct 13 '09 at 16:00
  • Yes, I know. In your answer, you say "epsilon -- relative rounding error, about 2^{-16} for double and 2^{-7} for float," which is plainly incorrect. The relative (forward) rounding error is also **not** always on that scale, but rather the backwards error. The forward error can be much, much larger when cancellation occurs, as is likely to happen here. – Stephen Canon Oct 14 '09 at 00:44
  • The ImageShack images are currently broken. – kibibu Feb 03 '15 at 23:26
  • 1
    The evaluation error is more correctly a multiple of `abs(f(x))*eps`, where the multiplicity relates to the number of floating point operations in the evaluation of `f(x)`. Thus `h~cbrt(abs(f(x)/f'''(x))*eps)` for the central difference. – Lutz Lehmann Jun 20 '18 at 19:51
6
fprime(x) = (f(x+dx) - f(x-dx)) / (2*dx)

for some small dx.

erikkallen
  • 1
    Numerical Recipes has some comments on that http://books.google.co.uk/books?id=1aAOdzK3FegC&printsec=frontcover&dq=related:ISBN0521437202#v=onepage&q=newton-Raphson&f=false – mmmmmm Oct 13 '09 at 12:31
5

What do you know about f(x)? If you only have f as a black box the only thing you can do is to numerically approximate the derivative. But the accuracy is usually not that good.

You can do much better if you can touch the code that computes f. Try "automatic differentiation". There are some nice libraries available for that. With a bit of library magic you can easily convert your function into something that computes the derivative automatically. For a simple C++ example, see the source code in this German discussion.
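To give a feel for the idea, here is a minimal forward-mode sketch in Python using dual numbers. It is not the code from the linked discussion and not any particular library's API; the `Dual` class, the `sin` wrapper, `derivative`, and the sample function `f` are all hypothetical names made up for this illustration, and only addition, subtraction, multiplication and sine are supported.

```python
import math

class Dual:
    """A number value + deriv * e with e * e = 0; deriv carries the derivative."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def sin(x):
    """sin extended to dual numbers via the chain rule."""
    if isinstance(x, Dual):
        return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)
    return math.sin(x)

def derivative(f, x):
    """Evaluate f'(x) by seeding the derivative part with 1."""
    return f(Dual(x, 1.0)).deriv

def f(x):
    return x * x * x - 2.0 * x + sin(x)

print(derivative(f, 1.5))  # compare with 3 * 1.5**2 - 2 + cos(1.5)
```

Because the derivative is propagated exactly through each operation, there is no step size to choose and no cancellation error from subtracting nearby function values.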

sellibitze
5

You definitely want to take into account John Cook's suggestion for picking h, but you typically don't want to use a centered difference to approximate the derivative. The main reason is that it costs an extra function evaluation. If you use a forward difference, i.e.,

f'(x) = (f(x+h) - f(x))/h

then you get the value of f(x) for free, because you already need to compute it for Newton's method. This isn't such a big deal when you have a scalar equation, but if x is a vector, then f'(x) is a matrix (the Jacobian), and you'll need to do n extra function evaluations to approximate it using the centered difference approach.
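A scalar Newton iteration that reuses f(x) in this way could look like the following Python sketch. The function name, the tolerances, and the choice to scale h with |x| are assumptions made for this illustration, as is the test equation x^2 - 2 = 0.

```python
import math
import sys

def newton_forward_difference(f, x0, tol=1e-12, max_iter=50):
    """Newton's method with f'(x) estimated by a forward difference."""
    h_rel = math.sqrt(sys.float_info.epsilon)
    x = x0
    for _ in range(max_iter):
        fx = f(x)                       # needed by Newton's method anyway
        h = h_rel * max(1.0, abs(x))    # assumption: scale the step with |x|
        slope = (f(x + h) - fx) / h     # only one extra evaluation per iteration
        if slope == 0.0:
            raise ZeroDivisionError("derivative estimate is zero")
        step = fx / slope
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            return x
    return x

root = newton_forward_difference(lambda x: x * x - 2.0, 1.0)
print(root)
```

Each iteration evaluates f twice, whereas a centered difference would need three evaluations (f(x), f(x + h) and f(x - h)).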

MikeT
3

In addition to John D. Cook's answer above, it is important to take into account not only the floating point precision, but also the robustness of the function f(x). For instance, in finance, it is common that f(x) is actually a Monte Carlo simulation, so the value of f(x) has some noise. Using a very small step size can in these cases severely degrade the accuracy of the derivative.
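The effect is easy to reproduce with a toy model. In this Python sketch, `noisy_sin` is a hypothetical stand-in for a Monte Carlo estimate (the true function plus Gaussian noise of size `sigma`), and the noise level and step sizes are assumptions chosen for illustration. The noise contributes an error of roughly sigma / h to the difference quotient, so shrinking h makes things worse.

```python
import math
import random

def central_difference(f, x, h):
    """Approximate f'(x) with (f(x + h) - f(x - h)) / (2 * h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

rng = random.Random(0)   # fixed seed so the run is repeatable
sigma = 1e-3             # hypothetical noise level of the simulation

def noisy_sin(x):
    """Stand-in for a Monte Carlo estimate: sin(x) plus Gaussian noise."""
    return math.sin(x) + rng.gauss(0.0, sigma)

exact = math.cos(1.0)
errors = {}
for h in (1e-6, 1e-3, 1e-1):
    errors[h] = abs(central_difference(noisy_sin, 1.0, h) - exact)
    print(f"h = {h:.0e}   error = {errors[h]:.3e}")
```

With noise of this size, the step that works well for a smooth, exactly evaluated function is far too small; a much larger h gives the better estimate.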

Rickard
1

Typically, signal noise impacts the derivative quality more than anything else. If you do have noise in your f(x), Savitzky-Golay is an excellent smoothing algorithm that is often used to compute good derivatives. In a nutshell, SG fits a polynomial locally to your data, and then this polynomial can be used to compute the derivative.
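As a rough illustration of the idea, the Python sketch below applies the first-derivative weights of a 5-point, quadratic Savitzky-Golay fit, (-2, -1, 0, 1, 2) / 10, to equally spaced samples. The function name and the sample data are assumptions for this example; the window length and polynomial order are arbitrary choices and would normally be tuned to the noise in the data.

```python
def savgol_first_derivative(y, dx):
    """First derivative from a 5-point, quadratic Savitzky-Golay fit.

    y holds samples at equal spacing dx. Only interior points are
    returned, i.e. estimates for y[2] through y[-3].
    """
    weights = (-2.0, -1.0, 0.0, 1.0, 2.0)
    slopes = []
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        # Slope of the least-squares parabola at the window center.
        slopes.append(sum(w * v for w, v in zip(weights, window)) / (10.0 * dx))
    return slopes

dx = 0.1
xs = [i * dx for i in range(11)]
ys = [x * x for x in xs]            # noise-free parabola as a sanity check
slopes = savgol_first_derivative(ys, dx)
print(slopes)                       # should match 2 * x at the interior points
```

On noisy data the same weights average the noise over the window instead of amplifying it the way a two-point difference does.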

Paul