The first block of code shown below works; my issue is in trying to adapt that block of code to my problem (defined just below it).

I am trying to use SciPy to minimize the function chisq below. The full code is a chain of functions (for example, expectperbin calls a distribution function and integrates over it); obsperbin, parameterguess, countperbin, and the initial parameters used below are all defined beforehand. Only the top of the function chain is shown:

from scipy.optimize import minimize
from scipy.stats import chisquare

def chisq( args ):
    ## first subscript [0] gives chi-squared value, [1] gives 0 ≤ p-value ≤ 1
    return chisquare( obsperbin , expectperbin( args ))[0]

def miniz( chisq , parameterguess ):
    ## optimization routine to minimize Chi Square (or negative p-value)
    globmin = minimize( chisq , parameterguess)
    while globmin.success == False:
        ## self-correcting mechanism if 'success test' fails

        try:
            globmin = minimize( chisq , parameterguess)
            print("ERROR:   MINIMIZE LOOPING AGAIN")
            break
        except globmin.success == True:
            print("TA DAA")
            break

    return globmin

res = miniz( chisq, [initial_mu , initial_sigma] ) ## FULL OPTIMIZED RESULT
print(res)

The code above finds the values of mu and sigma (given an initial guess for each) that minimize chisq, and it works as expected. I am now trying to generalize the code to the case where multiple distributions are defined. I use a fixed function argument, pickdist, to choose which pre-defined distribution is used to calculate expectperbin. My attempt is below:

def chisq( pickdist , args ):
    obsperbin = countperbin( pickdist = pickdist )
    expperbin = expectperbin( pickdist , args )
    return chisquare( obsperbin , expectperbin( pickdist , args ))[0]

def miniz( pickdist ):
    if pickdist == 1:
        parameterguess = paramguess1
    elif pickdist == 2:
        parameterguess = paramguess2
    elif pickdist == 3:
        parameterguess = paramguess3
    else:
        raise ValueError(errmsg)
    globmin = minimize( chisq , parameterguess, args = (pickdist))
    return globmin

Running that code raises the error

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The traceback points to the line obsperbin = countperbin ... (in chisq, above) and to the line if pickdist == 1: in the function below:

def countperbin( pickdist ):
    ## counts multiplicity of observed values per bin via dataset
    if pickdist == 1:
        dataset = data
    elif pickdist == 2:
        dataset = logdata
    elif pickdist == 3:
        dataset = logdata
    else:
        raise ValueError(errmsg)
    ## I can delete the code below if it's too long
    bincount = []
    subintervals = binbounder( pickdist )
    for jndex in range(len( subintervals )):
        if jndex != len( subintervals ) - 1:
            summ = 0
            for value in dataset:
                if value > subintervals [ jndex ] and value <= subintervals [ jndex + 1 ]:
                    summ += 1
            bincount.append(summ)
        if jndex == len( subintervals ) - 1:
            pass
    return bincount

obsperbin1 = countperbin( 1 )
... ## rest not shown to keep post short but can add if requested

I tried debugging this code after reading this post on SO but without success. How do I apply what the error message is telling me to my functions?
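
One possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: scipy.optimize.minimize calls the objective as fun(x, *args), so the parameter array is always the first positional argument and everything in args follows it. With the signature chisq( pickdist , args ), pickdist would then receive the array of parameters, and pickdist == 1 would produce an array of booleans that an if statement cannot reduce to a single truth value. The example below reproduces this with a made-up quadratic objective; the function names and target values are hypothetical stand-ins, not the chisq from the question.

```python
import numpy as np
from scipy.optimize import minimize

def objective_wrong_order(pickdist, params):
    # Hypothetical stand-in mirroring the question's signature:
    # selector first, parameters second.
    # minimize passes the parameter array first, so pickdist is an array here.
    if pickdist == 1:
        target = np.array([1.0, 2.0])
    else:
        target = np.array([3.0, 4.0])
    return float(np.sum((np.asarray(params) - target) ** 2))

def objective_right_order(params, pickdist):
    # Same body, but the parameters being optimized come first.
    if pickdist == 1:
        target = np.array([1.0, 2.0])
    else:
        target = np.array([3.0, 4.0])
    return float(np.sum((np.asarray(params) - target) ** 2))

guess = [0.0, 0.0]

# Wrong order: the comparison inside the objective sees a 2-element array.
try:
    minimize(objective_wrong_order, guess, args=(1,))
    wrong_order_error = None
except ValueError as exc:
    wrong_order_error = str(exc)

# Right order: the selector arrives through args as a plain int.
result = minimize(objective_right_order, guess, args=(1,))

print(wrong_order_error)
print(result.x)
```

If this is indeed the cause, swapping the parameters to chisq( args , pickdist ) would be the corresponding change. Note also that args = (pickdist) is not a tuple, because the parentheses only group the expression; (pickdist,) is the explicit form, although minimize appears to wrap a non-tuple value on its own.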

  • Well, from the error message, it seems that pickdist is a numpy array but your code seems to expect a builtin `int`. Can you elaborate on that? Please, show us `pickdist`. – Stefan Zobel Mar 19 '17 at 10:40
  • In every case, `pickdist` is an input for chained functions. I put more of the code in the last block to show an example. As for the `expectperbin` function, each input `pickdist` results in an output that is a lambda function of `x` and args `a,b`. –  Mar 19 '17 at 10:47
  • Would it be better to make `pickdist = [1,2,3]`, over which all distributions are accounted for, or perhaps a function `pickdist` that outputs `1` or `2` or `3`? –  Mar 19 '17 at 10:59
  • I get that. To me, it looks like you intend `pickdist` to be an `int` with the 3 possible values `1, 2, 3` (an [enum](https://docs.python.org/3/library/enum.html) might be a better choice btw). However, from the error message, you really seem to be passing a numpy array instead of an int. – Stefan Zobel Mar 19 '17 at 11:05
  • I am not sure why it would be passed as a numpy array, though my guess would be that the args = pickdist in the function miniz that uses the scipy module is the culprit. I will play around with it and update tomorrow. Thank you for the tip about enum, I had not thought of it being used here. Since I am using boolean logic for those parts, why is it a better choice? –  Mar 19 '17 at 11:09
  • As for enum's advantages, in Python, I'd say mostly value safety. But you should read [PEP 435](https://www.python.org/dev/peps/pep-0435/) for the official answer. – Stefan Zobel Mar 19 '17 at 11:17
  • I realized that my code is not passing pickdist as a numpy array but rather iterating through values in if/elif statements in my subfunctions. I asked a [new question](http://stackoverflow.com/q/42898612/7345804) to state the problem more clearly and abstractly. –  Mar 20 '17 at 08:51
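
The enum suggestion from the comments could look roughly like the sketch below. The class name Dist, its member names, and the helper pick_dataset are invented for illustration; only the values 1, 2, 3 and the pairing with data and logdata come from the question.

```python
from enum import Enum

class Dist(Enum):
    # Hypothetical member names; the question only specifies the values 1, 2, 3.
    FIRST = 1
    SECOND = 2
    THIRD = 3

def pick_dataset(pickdist, data, logdata):
    """Return the dataset for a distribution choice (hypothetical helper)."""
    choice = Dist(pickdist)  # raises ValueError for values other than 1, 2, 3
    datasets = {
        Dist.FIRST: data,
        Dist.SECOND: logdata,
        Dist.THIRD: logdata,
    }
    return datasets[choice]

data = [1.0, 10.0, 100.0]   # placeholder values, not from the question
logdata = [0.0, 1.0, 2.0]   # placeholder values, not from the question
chosen = pick_dataset(2, data, logdata)
```

The lookup replaces the if/elif chain, and an unsupported choice fails at the Dist(...) conversion instead of reaching the else branch.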