32

One example for curve is shown as below. The elbow point might be x=3 or 4. How to compute the elbow for a curve automatically and mathematically?

alt text

Jie
  • 387
  • 1
  • 4
  • 9
  • You may want to ask that here: http://math.stackexchange.com/. But in any case you need to provide some context on how the curve is produced and what possible shapes it can take. – TToni Dec 17 '10 at 15:33
  • 6
    possible duplicate of [finding the best trade-off point on a curve](http://stackoverflow.com/questions/2018178/finding-the-best-trade-off-point-on-a-curve) – Jacob Dec 17 '10 at 15:38
  • 1
    There is an excellent answer to this problem. Check out the link I've posted as a possible duplicate. – Jacob Dec 17 '10 at 15:39
  • The solutions for finding the best trade-off point on a curve (http://stackoverflow.com/questions/2018178/finding-the-best-trade-off-point-on-a-curve) is a good suggestion. However, this solution depends on the points on the curve. I take the suggestion of @ebo and @Chris Taylor by looking for the point with the maximum absolute second derivative which, for a set of discrete points x[i] as I have there, is approximated with a central difference: secondDerivative(i) = x(i+1) + x(i-1) - 2 * x(i); [max,idx] = max(secondDerivative); – Jie Dec 19 '10 at 00:03

4 Answers4

24

I created a Python package that attempts to implement the Kneedle algorithm.

To recreate the function above and detect the point of maximum curvature:

x = range(1,21)
y = [0.065, 0.039, 0.030, 0.024, 0.023, 0.022, 0.019, 0.0185, 0.0187,
     0.016, 0.015, 0.016, 0.0135, 0.0130, 0.0125, 0.0120, 0.0117, 0.0115, 0.0112, 0.013]

kn = KneeLocator(
    x,
    y,
    curve='convex',
    direction='decreasing',
    interp_method='interp1d',
)

print(kn.knee)
7
import matplotlib.pyplot as plt
plt.xlabel('x')
plt.ylabel('f(x)')
plt.xticks(range(1,21))
plt.plot(x, y, 'bx-')
plt.vlines(kn.knee, plt.ylim()[0], plt.ylim()[1], linestyles='dashed')

enter image description here

update
Kneed has an improved spline fitting method for handling local minima, use interp_method='polynomial'.

kn = KneeLocator(
    x,
    y,
    curve='convex',
    direction='decreasing',
    interp_method='polynomial',
)

print(kn.knee)
4

And the new plot:

plt.xlabel('x')
plt.ylabel('f(x)')
plt.xticks(range(1,21))
plt.plot(x, y, 'bx-')
plt.vlines(kn.knee, plt.ylim()[0], plt.ylim()[1], linestyles='dashed')

enter image description here

Kevin
  • 7,960
  • 5
  • 36
  • 57
22

You might want to look for the point with the maximum absolute second derivative which, for a set of discrete points x[i] as you have there, can be approximated with a central difference:

secondDerivative[i] = x[i+1] + x[i-1] - 2 * x[i]

As noted above, what you really want is the point with maximum curvature, but the second derivative will do, and this central difference is a good proxy for the second derivative.

Chris Taylor
  • 46,912
  • 15
  • 110
  • 154
  • 2
    The data provided is noisy. You'll need to be more careful than that. Your suggestion might identify, for example, x=12 or x=19. – Josephine Dec 18 '10 at 20:26
  • Thanks for your suggestions. I take your idea as my solutions. – Jie Dec 18 '10 at 23:51
  • 1
    Hi, Chris, You got me a good answer. Can you tell me if there is a reference for this solution? I want to draft a paper and so I need a reference for this solution. – Jie May 23 '11 at 07:49
  • 2
    Can you explain in what conditions this approximation is a decent estimate and in what conditions would be way off? – Omley Dec 29 '16 at 09:57
  • 3
    "what you really want is the point with maximum curvature, but the second derivative will do". The point about the curvature seems correct, but the second derivative will NOT ALWAYS do (and maybe will never do): think about function like exp(-x) -- it kind of has elbow, but its second derivative does not have anything event remotely resembling maximum where one would expect the location of the elbow. – Alex Fainshtein Oct 18 '17 at 04:42
  • Alex Fainshtein is right, it doesn't always work, especially for exp(-x)... The exact example in the question has fluctuations in the signal and the second derivative changes the sign... Answering Omley's question, this approximation is incorrect if the signal is not monotonic. – gevra Feb 16 '18 at 14:14
  • One might want to apply a smoothing algorithm first but this method has proven to be reliable in most cases with the data looks like the provided example. – GuillaumeL Jan 30 '19 at 15:36
4

Functions like this one are usually called L-curves for their shapes. They appear when solving ill-posed problems through regularization.

The 'elbow'-point is the point on the curve with the maximum absolute second derivative.

ebo
  • 8,985
  • 3
  • 31
  • 37
  • Yes, your ideas is the same as Chris Taylor. Thanks. – Jie Dec 18 '10 at 23:52
  • 1
    Another question is that why the 'elbow'-point is the point on the curve with the maximum absolute second derivative? – Jie Dec 19 '10 at 00:26
1

What you really want is the point with maximum curvature. When the slope is much smaller than 1, this can be approximated by the second derivative (as @ebo points out), but this is not always the case.

job
  • 9,003
  • 7
  • 41
  • 50