curve fit of array with large values

Question

I am trying to do some algorithmic analysis on a factorial function with increasing n values, shown in the code

import time
import matplotlib.pyplot as plt
import statistics

def iterFactorial(n):
    factorial = 1
        
    for i in range(1,n+1):
        factorial *= i
    return factorial
        
running_times = []
n_values = []
n_value_factorials = []
        
def algo_1_Time_Analysis():
    temp_time = [0]
                
    for n in range(1,2000,100):
        n_values.append(n)
        avg_run_time = statistics.mean(temp_time)
        running_times.append(avg_run_time)
        temp_time = [0]
        n_value_factorials.append(iterFactorial(n)) 
            
        for j in range(100):
            start_time = time.time()
            iterFactorial(n)
            end_time = time.time()
            function_time = end_time - start_time         
            temp_time.append(function_time)

Essentially I am trying to get an average running time from increasing n value factorials, then plot them and use curve_fit to generate the best fit model

The plot looks like this

![running time plot][1]

from scipy.optimize import curve_fit

def linear(x,a,b):
    return a*x + b

constants = curve_fit(linear,n_value_factorials,running_times)
a_fit = constants[0][0]
b_fit = constants[0][1]
fit = []

for i in n_value_factorials:
    fit.append(linear(i,a_fit,b_fit))

plt.plot(n_value_factorials,n_values)
plt.plot(n_value_factorials,fit)
plt.show()

I'm running into an error however when running the curve_fit function - I get the error: "OverflowError: int too large to convert to float"

Does anyone know if there's a fix and/or some of the theories of why I am getting this error?

EDIT: Thanks for the responses. I realised that I was using the wrong values in my curve_fit function - The original goal was to assess the best fit for the running times as a function of increasing n values in an interative factorial function. My error was that I was using an array that contained the factorials of n (i.e. return value from factorial function) instead of the n-values that were input in. Silly error really!

here is the corrected code with the plot output

from scipy.optimize import curve_fit
def linear(x,a,b):
    return a*x + b
constants = curve_fit(linear,n_values,running_times)
a_fit = constants[0][0]
b_fit = constants[0][1]
fit = []
for i in n_values:
    fit.append(linear(i,a_fit,b_fit))
plt.plot(n_values,running_times)
plt.plot(n_values,fit)
plt.show()

I realise that a linear model may not be best fit and may have to play around with it. Thanks for the responses! [1]: https://i.stack.imgur.com/5M3jJ.png

you need to use **bigintegers** for the computation of factorial. If you do not it will overflow even for small `n!` if you just not store the result to avoid overflow then it will not match the correct runtime nor complexity ... take a look at this [Fast exact bigint factorial](https://stackoverflow.com/a/18333853/2521214) its my approach for n! with some example values and times (that you can cross check with yours) — Spektre, Mar 01 '22 at 08:16
as you can see `128!=385620482362580421735677065923463640617493109590223590278828403276373402575165543560686168588507361534030051833058916347592172932262498857766114955245039357760034644709279247692495585280000000000000000000000000000000` is too huge to fit into any conventional datatype btw `float/double` will overflow too and due to accuracy error the result would be way off so now if I see it right you are doing `1!` ... `2000!` in your code so on 32/64 bit int you will overflow at 13!/21! also naive `O(n*mul_complexity)` method will be slow expect minutes or even hours to compute this — Spektre, Mar 01 '22 at 08:30
to remedy that you have to use fast factorial methods, and also fast multipication method so naive `O(log(n)^2)` will not do... [for "smaller" numbers you need Karatsuba and for bigger NTT based ones like Schönhage-Strassen](https://stackoverflow.com/q/18465326/2521214) ... bigint libs usually do this automatically ... btw also see this [how to measure complexity](https://stackoverflow.com/a/64784308/2521214) I think its very similar to what you are aiming for — Spektre, Mar 01 '22 at 08:34

curve fit of array with large values

0 Answers0