
[What I want] is to find the single smallest positive real root of the quartic function ax^4 + bx^3 + cx^2 + dx + e.

[Existing Method] My equation is used for collision prediction. The maximum degree is four, f(x) = ax^4 + bx^3 + cx^2 + dx + e, and the coefficients a, b, c, d, e can each be positive, negative, or zero (real floating-point values). So f(x) can be quartic, cubic, or quadratic depending on the input coefficients.

Currently, I use NumPy to find roots as below.

import numpy

root_output = numpy.roots([a, b, c, d, e])

The "root_output" from the NumPy module can be all possible real/complex roots depending on the input coefficient. So I have to look at "root_output" one by one, and check which root is the smallest real positive value (root>0?)

[The Problem] My program needs to execute numpy.roots([a, b, c, d, e]) many times, and all those calls are too slow for my project. The values of (a, b, c, d, e) change on every call to numpy.roots.

I need to run the code on a Raspberry Pi 2. Below is an example of the processing times:

  • Running many iterations of numpy.roots on a PC: 1.2 seconds
  • Running the same number of iterations on a Raspberry Pi 2: 17 seconds

Could you please guide me toward the fastest way to find the smallest positive real root? Suggestions using scipy.optimize, an algorithm that speeds up the root finding, or any other advice would be great.

Thank you.

[Solution]

  • Quadratic case: only the positive real roots are needed (please be aware of division by zero when a == 0)
import math

def SolvQuadratic(a, b, c):
    """Return the positive real roots of ax^2 + bx + c = 0, smallest first."""
    d = (b**2) - (4*a*c)
    if d < 0:
        # Complex conjugate roots: no real root at all
        return []

    if d > 0:
        square_root_d = math.sqrt(d)
        t1 = (-b + square_root_d) / (2 * a)
        t2 = (-b - square_root_d) / (2 * a)
        if t1 > 0:
            if t2 > 0:
                if t1 < t2:
                    return [t1, t2]
                return [t2, t1]
            return [t1]
        elif t2 > 0:
            return [t2]
        else:
            return []
    else:
        # d == 0: one repeated real root
        t = -b / (2*a)
        if t > 0:
            return [t]
        return []
  • Quartic case: you can use the pure Python/Numba version from @B.M.'s answer below. I also added a Cython version based on @B.M.'s code: save the code below as a .pyx file and compile it to get roughly 2x the speed of the pure Python version (please be aware of rounding issues); a minimal build sketch follows after the code.
import cmath

cdef extern from "complex.h":
    double complex cexp(double complex)

cdef double complex  J=cexp(2j*cmath.pi/3)
cdef double complex  Jc=1/J

cdef Cardano(double a, double b, double c, double d):
    cdef double z0
    cdef double a2, b2
    cdef double p ,q, D
    cdef double complex r
    cdef double complex u, v, w
    cdef double w0, w1, w2
    cdef double complex r1, r2, r3


    z0=b/3/a
    a2,b2 = a*a,b*b
    p=-b2/3/a2 +c/a
    q=(b/27*(2*b2/a2-9*c/a)+d)/a
    D=-4*p*p*p-27*q*q
    r=cmath.sqrt(-D/27+0j)
    u=((-q-r)/2)**0.33333333333333333333333
    v=((-q+r)/2)**0.33333333333333333333333
    w=u*v
    w0=abs(w+p/3)
    w1=abs(w*J+p/3)
    w2=abs(w*Jc+p/3)
    if w0<w1:
      if w2<w0 : v = v*Jc
    elif w2<w1 : v = v*Jc
    else: v = v*J
    r1 = u+v-z0
    r2 = u*J+v*Jc-z0
    r3 = u*Jc+v*J-z0
    return r1, r2, r3

cdef Roots_2(double a, double complex b, double complex c):
    cdef double complex bp
    cdef double complex delta
    cdef double complex r1, r2


    bp=b/2
    delta=bp*bp-a*c
    r1=(-bp-delta**.5)/a
    r2=-r1-b/a
    return r1, r2

def SolveQuartic(double a, double b, double c, double d, double e):
    "Ferrarai's Method"
    "resolution of P=ax^4+bx^3+cx^2+dx+e=0, coeffs reals"
    "First shift : x= z-b/4/a  =>  P=z^4+pz^2+qz+r"
    cdef double z0
    cdef double a2, b2, c2, d2
    cdef double p, q, r
    cdef double A, B, C, D
    cdef double complex y0, y1, y2
    cdef double complex a0, b0
    cdef double complex r0, r1, r2, r3


    z0=b/4.0/a
    a2,b2,c2,d2 = a*a,b*b,c*c,d*d
    p = -3.0*b2/(8*a2)+c/a
    q = b*b2/8.0/a/a2 - 1.0/2*b*c/a2 + d/a
    r = -3.0/256*b2*b2/a2/a2 + c*b2/a2/a/16 - b*d/a2/4+e/a
    "Second find y so P2=Ay^3+By^2+Cy+D=0"
    A=8.0
    B=-4*p
    C=-8*r
    D=4*r*p-q*q
    y0,y1,y2=Cardano(A,B,C,D)
    if abs(y1.imag)<abs(y0.imag): y0=y1
    if abs(y2.imag)<abs(y0.imag): y0=y2
    a0=(-p+2*y0)**.5
    if a0==0 : b0=y0**2-r
    else : b0=-q/2/a0
    r0,r1=Roots_2(1,a0,y0+b0)
    r2,r3=Roots_2(1,-a0,y0-b0)
    return (r0-z0,r1-z0,r2-z0,r3-z0)
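
A minimal build sketch for the Cython version, assuming the code above is saved as quartic_solver.pyx (the file and module names are just placeholders):

# setup.py -- illustrative only; build with: python setup.py build_ext --inplace
from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("quartic_solver.pyx"))

# After building, the solver can be imported and used like this:
#   from quartic_solver import SolveQuartic
#   r0, r1, r2, r3 = SolveQuartic(a, b, c, d, e)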

[Problem of Ferrari's method] We hit a problem when the coefficients of the quartic equation are [0.00614656, -0.0933333333333, 0.527664995846, -1.31617928376, 1.21906444869]: the output from numpy.roots and the Ferrari method is entirely different (numpy.roots gives the correct output).

import numpy as np
import cmath


J=cmath.exp(2j*cmath.pi/3)
Jc=1/J

def ferrari(a,b,c,d,e):
    "Ferrarai's Method"
    "resolution of P=ax^4+bx^3+cx^2+dx+e=0, coeffs reals"
    "First shift : x= z-b/4/a  =>  P=z^4+pz^2+qz+r"
    z0=b/4/a
    a2,b2,c2,d2 = a*a,b*b,c*c,d*d
    p = -3*b2/(8*a2)+c/a
    q = b*b2/8/a/a2 - 1/2*b*c/a2 + d/a
    r = -3/256*b2*b2/a2/a2 +c*b2/a2/a/16-b*d/a2/4+e/a
    "Second find y so P2=Ay^3+By^2+Cy+D=0"
    A=8
    B=-4*p
    C=-8*r
    D=4*r*p-q*q
    y0,y1,y2=Cardano(A,B,C,D)
    if abs(y1.imag)<abs(y0.imag): y0=y1
    if abs(y2.imag)<abs(y0.imag): y0=y2
    a0=(-p+2*y0)**.5
    if a0==0 : b0=y0**2-r
    else : b0=-q/2/a0
    r0,r1=Roots_2(1,a0,y0+b0)
    r2,r3=Roots_2(1,-a0,y0-b0)
    return (r0-z0,r1-z0,r2-z0,r3-z0)

#~ @jit(nopython=True)
def Cardano(a,b,c,d):
    z0=b/3/a
    a2,b2 = a*a,b*b
    p=-b2/3/a2 +c/a
    q=(b/27*(2*b2/a2-9*c/a)+d)/a
    D=-4*p*p*p-27*q*q
    r=cmath.sqrt(-D/27+0j)
    u=((-q-r)/2)**0.33333333333333333333333
    v=((-q+r)/2)**0.33333333333333333333333
    w=u*v
    w0=abs(w+p/3)
    w1=abs(w*J+p/3)
    w2=abs(w*Jc+p/3)
    if w0<w1:
      if w2<w0 : v*=Jc
    elif w2<w1 : v*=Jc
    else: v*=J
    return u+v-z0, u*J+v*Jc-z0, u*Jc+v*J-z0

#~ @jit(nopython=True)
def Roots_2(a,b,c):
    bp=b/2
    delta=bp*bp-a*c
    r1=(-bp-delta**.5)/a
    r2=-r1-b/a
    return r1,r2

coef = [0.00614656, -0.0933333333333, 0.527664995846, -1.31617928376, 1.21906444869]
print("Coefficient A, B, C, D, E", coef) 
print("") 
print("numpy roots: ", np.roots(coef)) 
print("") 
print("ferrari python ", ferrari(*coef))
  • could you please clarify what did you mean with that: "and many numpy.roots/one loop is too slow for my project"? It's not quite clear – MaxU - stand with Ukraine Mar 04 '16 at 12:29
  • I actually mean numpy.roots is too slow, I need to know the faster way to find the root. Because numpy.roots find all possible real/complex roots. But I just want to know the only one positive real root output. So I think that maybe someone know how to find it faster than numpy.roots. – Utthawut Mar 04 '16 at 12:34
  • unfortunately i can't help you with scipy/numpy, but i can recommend you to cache your results - i.e. to make sure that you will never execute numpy.roots() with the same set of (a,b,c,d,e). I don't know though - maybe you are already doing that... – MaxU - stand with Ukraine Mar 04 '16 at 12:42
  • beside that you may want to ask here: http://math.stackexchange.com/ as well... – MaxU - stand with Ukraine Mar 04 '16 at 12:43
  • Thank you @MaxU. the (a,b,c,d,e) is always changed. It has very low possibility to be the same value. But I will add your condition to my topic, to let another guy know more scope of my problem. I'm very new to this community. How can I tag this topic to [link]math.stackexchange.com or I should create the new one to [link]math.stackexchange.com? Thank you – Utthawut Mar 04 '16 at 12:49
  • I think you should create open a new topic there, but i wouldn't ask there about the python/scipy/numpy solution. You may ask about math or algorithm solutions there... – MaxU - stand with Ukraine Mar 04 '16 at 13:26

4 Answers


Another answer:

Do it with analytic methods (Ferrari, Cardano) and speed up the code with just-in-time compilation (Numba):

Let's see the improvement first:

In [2]: P=poly1d([1,2,3,4],True)

In [3]: roots(P)
Out[3]: array([ 4.,  3.,  2.,  1.])

In [4]: %timeit roots(P)
1000 loops, best of 3: 465 µs per loop

In [5]: ferrari(*P.coeffs)
Out[5]: ((1+0j), (2-0j), (3+0j), (4-0j))

In [5]: %timeit ferrari(*P.coeffs) #pure python without jit
10000 loops, best of 3: 116 µs per loop    
In [6]: %timeit ferrari(*P.coeffs)  # with numba.jit
100000 loops, best of 3: 13 µs per loop

Then the ugly code:

for order 4:

from numpy import exp, pi, sqrt, empty, complex128
from numba import jit

@jit(nopython=True)
def ferrari(a,b,c,d,e):
    "resolution of P=ax^4+bx^3+cx^2+dx+e=0"
    "CN all coeffs real."
    "First shift : x= z-b/4/a  =>  P=z^4+pz^2+qz+r"
    z0=b/4/a
    a2,b2,c2,d2 = a*a,b*b,c*c,d*d 
    p = -3*b2/(8*a2)+c/a
    q = b*b2/8/a/a2 - 1/2*b*c/a2 + d/a
    r = -3/256*b2*b2/a2/a2 +c*b2/a2/a/16-b*d/a2/4+e/a
    "Second find X so P2=AX^3+BX^2+C^X+D=0"
    A=8
    B=-4*p
    C=-8*r
    D=4*r*p-q*q
    y0,y1,y2=cardan(A,B,C,D)
    if abs(y1.imag)<abs(y0.imag): y0=y1 
    if abs(y2.imag)<abs(y0.imag): y0=y2 
    a0=(-p+2*y0.real)**.5
    if a0==0 : b0=y0**2-r
    else : b0=-q/2/a0
    r0,r1=roots2(1,a0,y0+b0)
    r2,r3=roots2(1,-a0,y0-b0)
    return (r0-z0,r1-z0,r2-z0,r3-z0) 

for order 3:

J=exp(2j*pi/3)
Jc=1/J

@jit(nopython=True) 
def cardan(a,b,c,d):
    u=empty(2,complex128)
    z0=b/3/a
    a2,b2 = a*a,b*b    
    p=-b2/3/a2 +c/a
    q=(b/27*(2*b2/a2-9*c/a)+d)/a
    D=-4*p*p*p-27*q*q
    r=sqrt(-D/27+0j)        
    u=((-q-r)/2)**0.33333333333333333333333
    v=((-q+r)/2)**0.33333333333333333333333
    w=u*v
    w0=abs(w+p/3)
    w1=abs(w*J+p/3)
    w2=abs(w*Jc+p/3)
    if w0<w1: 
        if w2<w0 : v*=Jc
    elif w2<w1 : v*=Jc
    else: v*=J        
    return u+v-z0, u*J+v*Jc-z0,u*Jc+v*J-z0

for order 2:

@jit(nopython=True)
def roots2(a,b,c):
    bp=b/2    
    delta=bp*bp-a*c
    u1=(-bp-delta**.5)/a
    u2=-u1-b/a
    return u1,u2  

Probably needs to be tested further, but it is efficient.
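
To then extract the smallest positive real root from the four values ferrari returns, the numerically complex roots still have to be filtered out; a sketch, where the 1e-9 tolerance is an arbitrary choice to absorb rounding noise:

def smallest_positive(roots, imag_tol=1e-9):
    # keep roots whose imaginary part is negligible and whose real part is strictly positive
    reals = [r.real for r in roots if abs(r.imag) < imag_tol and r.real > 0]
    return min(reals) if reals else None

# usage: smallest_positive(ferrari(a, b, c, d, e))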

  • That's great! I will test your analytic methods and then tell you the result, maybe with cython because I used to implement code in cython. – Utthawut Mar 07 '16 at 04:58
  • I have many questions, 1. What is trade off between your new analytic method and eigenvalues with companion matrix (numpy) as your previous reply. 2. You said that my factor is expecting positive real roots from the equation cannot speed up finding root process. But I actually expect the root output only in the range of 0 < x ≤ 7, any root outside this range is discarded. Can the additional factor speed up finding root with scipy.optimize.brentq or anything? Please point me to the good direction with your explanation or useful document. Thank you. – Utthawut Mar 07 '16 at 04:59
  • 1. eigenvalues computations is very costly, using 4x4 matrix. 2. brentq is by approximation an can be long too. analytic method are straightforward, only useful computations. too find the best algorithm, a more precise presentation of your problem is needed. How do you obtain your quadric equation ? – B. M. Mar 07 '16 at 07:12
  • For example; My quartic equation is obtained by two balls are moving in non-linear (accelerate) and I will make equation for collision prediction. So it will generate coef for ax^4 + bx^3 + cx^2 + dx +e. After finding root process, I have to decide which root is the smallest positive real root that will be the possible time which two balls will collide. – Utthawut Mar 07 '16 at 11:43
  • After testing your new analytic method, the most of testing is correct. But I found some test case for example; when P = np.poly1d([**0.1**, + 0.222210408124 + 1.29967219908j, 0.222210408124 - 1.29967219908j, **1.68200392658**],True) -> your analytic method will generate output (0.22221040812399995-1.29967219908j), (0.22221040812400006+1.29967219908j), **(0.0999999999999997+1.4984455319399336e-16j)**, **(1.68200392658-6.6842381885954808e-17j)** It included imaginary part which I don't expect. Should the imag part be checked with predefined epsilon value? Or you have any idea? Thank you – Utthawut Mar 07 '16 at 11:55
  • of course there are rounding issues, use round(x.imag) to decide. If you have two balls, I thinks is possible to have two quadratic equation of 2 variable, which can be very simple to solve. – B. M. Mar 07 '16 at 16:42
  • To reduce quartic to quadratic of 2 variables. My equation must be ax4+bx2+c? But my quartic equation has odd degree ax4 + **bx3** + cx2 + **dx** + e. – Utthawut Mar 08 '16 at 10:37
  • No, what I hope is to go back to the initial problem. For one ball, the equation is more simple. perhaps it s not necessary to combine the two movements immediately, and precompute some things before to simplify the problem – B. M. Mar 08 '16 at 14:00
  • Thank you for your advice. I also add cython version to the head of topic. – Utthawut Mar 09 '16 at 09:00
  • Dear @B. M. After testing your ferrari Python implementation, I found a bug when the coefficients are [0.00614656, -0.0933333333333, 0.527664995846, -1.31617928376, 1.21906444869]. The output from numpy.roots is entirely different from the Ferrari method; see the reproduction code added at the end of the question. – Utthawut Apr 10 '16 at 06:39
  • Could you please help me find the problem? – Utthawut Apr 12 '16 at 04:15
  • Hi, I updated the program from my computer, and I get the same results now. Don't know where the bug was. – B. M. Apr 15 '16 at 16:22
  • Hi @B.M. I found the problem: it is in the comparison of the imaginary values, in the code if abs(y1.imag)… – Utthawut Apr 21 '16 at 11:59

The numpy solution to do this without a loop is:

from numpy import array, roots

p=array([a,b,c,d,e])
r=roots(p)
r[(r.imag==0) & (r.real>=0)].real.min()

scipy.optimize methods will be slower, unless you don't need precision:

In [586]: %timeit r=roots(p);r[(r.imag==0) & (r.real>=0) ].real.min()
1000 loops, best of 3: 334 µs per loop

In [587]: %timeit newton(poly1d(p),10,tol=1e-8)
1000 loops, best of 3: 555 µs per loop

In [588]: %timeit newton(poly1d(p),10,tol=1)
10000 loops, best of 3: 177 µs per loop

And then you still have to find the min...
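
Note that r.imag==0 is an exact comparison; with floating-point eigenvalues it may be safer to allow a small tolerance and, if only strictly positive roots count, to use > 0 (the 1e-9 threshold below is an arbitrary choice):

keep = (abs(r.imag) < 1e-9) & (r.real > 0)
candidates = r[keep].real
smallest = candidates.min() if candidates.size else None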

EDIT

For a 2x factor, do yourself what roots does internally (build the companion matrix and take its eigenvalues):

In [638]: b=zeros((4,4),float);b[1:,:-1]=eye(3)

In [639]: c=b.copy();c[0]=-(p/p[0])[1:];eig(c)[0]
Out[639]: 
array([-7.40849430+0.j        ,  5.77969794+0.j        ,
       -0.18560182+3.48995646j, -0.18560182-3.48995646j])

In [640]: roots(p)
Out[640]: 
array([-7.40849430+0.j        ,  5.77969794+0.j        ,
       -0.18560182+3.48995646j, -0.18560182-3.48995646j])

In [641]: %timeit roots(p)
1000 loops, best of 3: 365 µs per loop

In [642]: %timeit c=b.copy();c[0]=-(p/p[0])[1:];eig(c)
10000 loops, best of 3: 181 µs per loop
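
The same construction wrapped into a reusable function (a sketch; the function name is arbitrary, and eigvals is used since only the eigenvalues are needed):

import numpy as np

def roots_companion(p):
    # p: coefficients [a, b, c, d, e] with a != 0, highest degree first
    p = np.asarray(p, dtype=float)
    n = p.size - 1                     # polynomial degree
    A = np.zeros((n, n), float)
    A[1:, :-1] = np.eye(n - 1)         # sub-diagonal identity block of the companion matrix
    A[0] = -(p / p[0])[1:]             # first row: normalized coefficients with sign flipped
    return np.linalg.eigvals(A)        # eigenvalues of the companion matrix = roots of p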
  • Thank you. But the cpu-intensive is inside np.roots function. – Utthawut Mar 04 '16 at 13:07
  • For running on my PC µs time is acceptable. many many times of executing numpy.roots take time around 1.2 second on my PC. But running the same code on my Raspberry Pi2 take 17 seconds (14x slower). So I'm finding the solution for extracting only real positive root which is faster than numpy.roots. – Utthawut Mar 05 '16 at 05:50
  • I also get an error when running your 2x factor code. `from numpy import zeros, eye, copy, roots; from numpy.linalg import eig; example_equation = "3x^4 + 6x^3 - 123x^2 -126x +1080 = 0"; p = [3, 6, -123, -126, 1080]; roots(p); #array([-6., -4., 5., 3.]); b = zeros((4, 4), float); b[1:,:-1] = eye(3); c = b.copy(); c[0] = -(p/p[0])[1:];` **Traceback (most recent call last): File "", line 1, in c[0] = -(p/p[0])[1:] TypeError: unsupported operand type(s) for /: 'list' and 'int'** – Utthawut Mar 05 '16 at 06:08
  • Sorry: I made `p=array(p)` beforehand for handy numpy use. IMHO, finding the roots can't be numerically improved by such a factor. The hope is to refactor the initial problem; a couple of quadratic equations could be easier, for example. Expose it (in another post?). – B. M. Mar 05 '16 at 07:39
  • For now, your manual calculation (companion matrix) can speed up my program up to 15% for quartic function. In case of 2 degree quadratic (ax^2+bx+c), do you have idea to speed up? If I use companion matrix to speed up quadratic function, could you please give me an example? Thank you. – Utthawut Mar 06 '16 at 07:24
  • I am working on a solution via analytic method,via ferrari and cardano formula(http://stackoverflow.com/questions/35820676/fail-to-implement-cardano-method-cube-root-of-a-complex-number). I think I can win a 10x factor or more with numba. for 2, you probably learn the method at school ;) . – B. M. Mar 06 '16 at 07:50
  • OK, I will try to check discriminant before taking to quadratic formula (square root), and compare time between quadratic formula and numpy.roots. – Utthawut Mar 06 '16 at 10:08

If polynomial coefficients are known ahead of time, you can speed up by vectorizing the computation in roots (given Numpy >= 1.10 or so):

import numpy as np

def roots_vec(p):
    p = np.atleast_1d(p)
    n = p.shape[-1]
    A = np.zeros(p.shape[:1] + (n-1, n-1), float)
    A[...,1:,:-1] = np.eye(n-2)
    A[...,0,:] = -p[...,1:]/p[...,None,0]
    return np.linalg.eigvals(A)

def roots_loop(p):
    r = []
    for pp in p:
        r.append(np.roots(pp))
    return r

p = np.random.rand(2000, 4)  # 2000 random polynomials with 4 coefficients each (degree 3)

assert np.allclose(roots_vec(p), roots_loop(p))

In [35]: %timeit roots_vec(p)
100 loops, best of 3: 4.49 ms per loop

In [36]: %timeit roots_loop(p)
10 loops, best of 3: 81.9 ms per loop

It might not beat the analytical solution + Numba, but it allows higher orders.
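
To then pick the smallest positive real root for each polynomial from the vectorized result, something along these lines can be used (the tolerance and the NaN marker for "no positive real root" are arbitrary choices):

r = roots_vec(p)                                            # shape (n_polynomials, degree)
real_pos = np.where((abs(r.imag) < 1e-9) & (r.real > 0), r.real, np.inf)
smallest = real_pos.min(axis=1)                             # np.inf where no positive real root exists
smallest[np.isinf(smallest)] = np.nan                       # mark those cases as NaN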

  • My polynomial coefficients are generated sequentially (I only know the next coefficients after finding the current root), I'm afraid. Anyway, thank you for your answer. – Utthawut Mar 09 '16 at 09:13

In SymPy the real_roots function will return the sorted roots of a polynomial and you can loop through them and break on the first positive one:

from random import randint
from sympy import Poly, real_roots
from sympy.abc import x

for do in range(100):
  p = Poly.from_list([randint(-100,100) for i in range(5)], x)
  for i in real_roots(p):
    if i.is_positive:
      print(i)
      break
  else:
    print('no positive root', p)

So a function for this might be

def small_pos_root(a,b,c,d,e):
    from sympy.abc import x
    from sympy import Poly, real_roots
    for i in real_roots(Poly.from_list([a,b,c,d,e], x)):
        if i.is_positive: return i.n()

This will return None if there is no positive real root, otherwise it will return a numerical value of the first positive root.
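
Illustrative usage, with coefficients chosen arbitrarily so the roots are known:

# the quartic x*(x - 1)*(x**2 - 2*x - 2) has roots 1-sqrt(3), 0, 1 and 1+sqrt(3),
# so the smallest positive real root is 1
print(small_pos_root(1, -3, 0, 2, 0))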
