
I have a Python function that processes some real data, and I need to find the global maximum of that function.

I tried to solve this with the `scipy.optimize.minimize` function, but the results vary a lot depending on the method used (the Powell method gives by far the best results in my case). More importantly, if I run `minimize` repeatedly on the same data but with different, randomly chosen starting values, I also get completely different results:

  1. -14850.0 [-0.14850000000000002%]
  2. 70490.0 [0.7049000000000001%]
  3. 115480.0 [1.1548%]
  4. 81800.0 [0.818%]
  5. 60330.0 [0.6033000000000001%]
  6. -70070.0 [-0.7007%]
  7. 7560.0 [0.0756%]
  8. -19940.0 [-0.1994%]
  9. 51430.0 [0.5143%]
  10. 51430.0 [0.5143%]
  11. -5730.0 [-0.0573%]
  12. 167160.0 [1.6716000000000002%]
  13. 168060.0 [1.6806%]
  14. -20090.0 [-0.2009%]

As you can see, the difference between the best result (168060.0 / 1.6806%) and the worst (-70070.0 / -0.7007%) is quite big. I could simply run the minimize function many times (maybe 100 or 1000 times?) and then choose the best result (a rough sketch of that loop is shown after my code below).

But I was also wondering: isn't there some tool that would convert my Python function into a mathematical function so that a derivative can be computed? Then, I suppose, it would be possible to find the maximum more efficiently.
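
To illustrate what I mean, here is a toy example of what such a tool could look like: SymPy can turn a (very simple) Python function into a symbolic expression and differentiate it. This is only a sketch with a made-up stand-in function; I have no idea whether anything like this can work for my real function below, which uses loops, integer indexing and if-statements:

```python
import sympy as sp

def toy_objective(x):
    # deliberately simple stand-in, NOT my real function
    return -3 * x**2 + 5 * x + 1

x = sp.Symbol('x')
expr = toy_objective(x)                    # SymPy expression: -3*x**2 + 5*x + 1
derivative = sp.diff(expr, x)              # 5 - 6*x
critical_points = sp.solve(derivative, x)  # [5/6] -> candidate for the maximum
```

For reference, my actual function and the way I call `minimize` are below.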

```python
import numpy as np
from numba import jit
from scipy.optimize import minimize


@jit(nopython=True)
def FindFunctionForTestCHA06(coefs, value, numberOfGames):
    # data is the global 2D array loaded below; each row describes one game
    for row in data:
        result = row[0]
        favOdds = row[1]
        dogOdds = row[2]

        housePoints = np.zeros(12)

        # per-house coefficients for the favourite and the underdog,
        # selected by the ids stored in columns 31..42 of the row
        coefsFav = np.empty(12)
        coefsDog = np.empty(12)
        for h in range(12):
            houseId = int(row[31 + h])
            coefsFav[h] = coefs[347 + houseId]
            coefsDog[h] = coefs[359 + houseId]

        dayOfWeekFav = coefs[371 + int(row[43])]
        dayOfWeekDog = coefs[378 + int(row[43])]

        # accumulate per-house points from the 14 index pairs stored in
        # columns 3..16 and 17..30 of the row
        for p in range(14):
            blah = int(row[3 + p] - 1)
            house = int(row[17 + p] - 1)
            housePoints[house] += coefs[p * 12 + blah] * coefs[168 + p * 12 + house]

        favHousePoints = 0.0
        dogHousePoints = 0.0
        for h in range(12):
            favHousePoints += housePoints[h] * coefsFav[h]
            dogHousePoints += housePoints[h] * coefsDog[h]

        favPoints = favHousePoints * dayOfWeekFav
        dogPoints = dogHousePoints * dayOfWeekDog

        # add the payout whenever the model picks the actual winner
        if result == 1:
            if favPoints >= dogPoints:
                value += favOdds * 1000
        elif result == 2:
            if dogPoints > favPoints:
                value += dogOdds * 1000

    # negate so that minimizing this function maximizes the value
    return -value


data = np.load(numpyFilename, allow_pickle=True)  # numpyFilename is the path to my data file
start = np.random.sample(386)
value = -1000 * len(data)
numberOfGames = len(data)

res = minimize(FindFunctionForTestCHA06, start,
               args=(value, numberOfGames),
               method='Powell', tol=1e-10,
               options={'maxiter': 99999999999,
                        'xtol': 1e-8, 'ftol': 1e-8,
                        'disp': True, 'return_all': True})
```
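
For completeness, this is roughly how I imagine the "run it many times and keep the best result" approach mentioned above. It is only a sketch; the 100 restarts is an arbitrary number:

```python
# Sketch: restart Powell from many random starting points and keep the best
# (i.e. lowest, since the objective returns the negated value) result.
best_res = None
for _ in range(100):  # number of restarts chosen arbitrarily
    start = np.random.sample(386)
    res = minimize(FindFunctionForTestCHA06, start,
                   args=(value, numberOfGames), method='Powell')
    if best_res is None or res.fun < best_res.fun:
        best_res = res

print(best_res.fun, best_res.x)
```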
 
    *"But I was also wondering: isn't there some tool that would convert my Python function into a mathematical function so that a derivation can be done?"* If you already know what family of functions you want, this is an extensively-studied optimisation problem. For instance, perhaps you know that your function can be written as `y = a sin(b x + c) + d`, and you want to find the best parameters `a,b,c,d`. – Stef Aug 23 '22 at 16:04
  • But if you don't know at all what family of functions you want to fit your curve with, then there is not much we can do. – Stef Aug 23 '22 at 16:04
  • See for instance these fit functions: [scipy.curve_fit](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html); [numpy.Polynomial.fit](https://numpy.org/doc/stable/reference/generated/numpy.polynomial.polynomial.Polynomial.fit.html); – Stef Aug 23 '22 at 16:09
  • See also these questions: [How do I fit a sine curve to my data with pylab and numpy?](https://stackoverflow.com/questions/16716302/how-do-i-fit-a-sine-curve-to-my-data-with-pylab-and-numpy); [Fitting sin curve using python](https://stackoverflow.com/questions/47085244/fitting-sin-curve-using-python); – Stef Aug 23 '22 at 16:11
  • @Stef That's the problem. I have only the Python function (see above) but don't know what family of function that is and don't know how to find out. So should I perhaps run the function many times, plot the results and see if it resembles any function? – velkyvont Aug 23 '22 at 16:20
  • Yes you definitely should! – Stef Aug 23 '22 at 16:22
  • There is no way to solve this problem if you know nothing about the function. – Riccardo Bucco Aug 23 '22 at 16:22
  • @Stef Maybe I'm completely wrong (I'm not good at math), but I think I'm trying to do a multivariate logistic regression with a custom loss function. – velkyvont Aug 23 '22 at 16:42
  • @velkyvont It doesn't have to be as complicated as that. For instance, if the model function that you are trying to fit to your data is a polynomial, then linear regression is sufficient. – Stef Aug 23 '22 at 18:36

0 Answers