2

I am using PyMC3 for parameter estimation using a particular likelihood function which has to be defined. I googled it and found out that I should use the densitydist method for implementing the user defined likelihood functions but it is not working. How to incorporate a user defined likelihood function in PyMC3 and to find out the maximum a posteriori (MAP) estimate for my model? My code is given below. Here L is the analytic form of my Likelihood function. I have some observational data for the radial velocity(vr) and postion (r) for some objects, which is imported from excel file.

data_ = np.array(pandas.read_excel('aaa.xlsx',header=None))
gamma=3.77;
G = 4.302*10**-6;
rmin = 3.0;
R = 95.7;
vr=data_[:,1];
r= data_[:,0];
h= np.pi;

class integrateOut(theano.Op):
 def __init__(self,f,t,t0,tf,*args,**kwargs):
    super(integrateOut,self).__init__()
    self.f = f
    self.t = t
    self.t0 = t0
    self.tf = tf

 def make_node(self,*inputs):
    self.fvars=list(inputs)

    try:
        self.gradF = tt.grad(self.f,self.fvars)
    except:
        self.gradF = None
    return theano.Apply(self,self.fvars,[tt.dscalar().type()])

 def perform(self,node, inputs, output_storage):

    args = tuple(inputs)
    f = theano.function([self.t]+self.fvars,self.f)
    output_storage[0][0] = quad(f,self.t0,self.tf,args=args)[0]

 def grad(self,inputs,grads):
    return [integrateOut(g,self.t,self.t0,self.tf)(*inputs)*grads[0] \
        for g in self.gradF] 

basic_model = pm.Model()
with basic_model:
   M=[]
   beta=[]
   interval=0.01*10**12
   M=pm.Uniform('M', 
               lower=0.5*10**12,upper=3.50*10**12,transform='interval')
   beta=pm.Uniform('beta',lower=2.001,upper=2.999,transform='interval')
   gamma=3.77
   logp=[]
   arr=[]
   vnew=[]
   rnew=[]
   theta = tt.scalar('theta')
   beta = tt.scalar('beta')
   z = tt.cos(theta)**(2*( (gamma/(beta - 2)) - 3/2) + 3)    
   intZ = integrateOut(z,theta,-(np.pi)/2,(np.pi)/2)(beta)
   gradIntZ = tt.grad(intZ,[beta])
   funcIntZ = theano.function([beta],intZ)
   funcGradIntZ = theano.function([beta],gradIntZ)  
   for j in np.arange(0,59,1):
     vnew.append(vr[j]+(0.05*vr[j]*float(dm.Decimal(rm.randrange(1, 
     20))/10)));
     rnew.append(r[j]+(0.05*r[j]*float(dm.Decimal(rm.randrange(1, 
     20))/10)));
   vn=np.array(vnew)
   rn=np.array(rnew)
   for beta in np.arange (2.01,2.99,0.01):
     for M in np.arange (0.5,2.50,0.01):
         i=np.arange(0,59,1)
         q =( gamma/(beta - 2)) - 3/2
         B = (G*M*10**12)/((beta -2 )*( R**(3 - beta)))
         K = (gamma - 3)/((rmin**(3 - gamma))*funcIntZ(beta)*m.sqrt(2*B))
         logp= -np.log(K*((1 -(( 1/(2*B) )*((vn[i]**2)*rn[i]**(beta - 
                          2))))**(q+1))*(rn[i]**(1-gamma +(beta/2))))
         arr.append(logp.sum())
   def logp_func(rn,vn):
      return min(np.array(arr))  
   logpvar = pm.DensityDist("logpvar", logp_func, observed={"rn": rn,"vn":vn})
    start = pm.find_MAP(model=basic_model)
    step = pm.Metropolis()
    basicmodeltrace = pm.sample(10000, step=step, 
    start=start,random_seed=1,progressbar=True)
    print(pm.summary(basicmodeltrace))
    map_estimate = pm.find_MAP(model=basic_model)
    print(map_estimate)

I am getting the following error message:

 ValueError: Cannot compute test value: input 0 (theta) of Op 
 Elemwise{cos,no_inplace}(theta) missing default value.  
 Backtrace when that variable is created:

I am unable to get the output since the numerical integration is not working. I have used custom theano op for numerical integration code which i got from Custom Theano Op to do numerical integration . The integration works if I run it seperately inputting a particular value of beta, but not within the model.

Lekshmi
  • 21
  • 2
  • Could you post a complete working example? Do you really want a MAP, why don't you just use `pm.sample()` and get a full posterior? – aloctavodia Jan 08 '18 at 14:39
  • @aloctavodia Sir, I have done some edits in my code and have posted a complete working example. My problem with logp is resolved but i am unable to get an accurate parameter estimate. – Lekshmi Jan 09 '18 at 08:59
  • I do not have access to the file 'aaa.xlsx', and thus I can not run your model. What do you get if you just run your model by doing `basicmodeltrace = pm.sample(1000)`. Why are you using Uniform distributions as priors? Unless the boundaries of the Uniform prior has some physical meaning using something like Gaussian with large variance is usually a much better choice. – aloctavodia Jan 09 '18 at 14:06
  • This is actually a problem in Astrophysics which impose a physical meaning for the uniform priors. I have run the model using 'basicmodeltrace = pm.sample(1000)' . I am getting the same result as before.I am unable to fetch the M and beta corresponding to the minimum of logp. The M array and beta array are simply the mean of boundary values. Even if I comment out the likelihood function part, I will get the same output except a value for logp. So does this means that the priors are not input to the likelihood function? Is the for loops for M and beta neccessary? @aloctavodia – Lekshmi Jan 10 '18 at 04:31
  • Sorry I did not notice this before, but you don't have to loop over M and beta. You have to let PyMC3 sample the values of M and beta (instead of performing a grid search like you are doing). Also your `logp_func` should return tt.sum(logp) (you don't have to compute `min`). Check [this](https://github.com/pymc-devs/pymc3/blob/f9917a2b021dc84215ca0ea9ac2f88e0e7f34a74/pymc3/examples/arbitrary_stochastic.py#L6-L18) example. Also you may need to replace `np.log` by `tt.log` and m.sqrt(2*B) by (2*B)**0.5. – aloctavodia Jan 10 '18 at 18:56
  • @aloctavodia Sir, thank you for the help. It worked but without the integral value li(beta). The integration shows an error message : `File "C:\Users\mypc\Anaconda3\Anaconda_3\lib\site-packages\scipy\integrate\quadpack.py", line 388, in _quad return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit) TypeError: must be real number, not TensorVariable` – Lekshmi Jan 11 '18 at 07:22
  • the `beta` you are passing to integrad and li is a `TensorVariable`, but those functions work with `floats`. you have to define a Custom Theano Op, check [here](https://stackoverflow.com/questions/42678490/custom-theano-op-to-do-numerical-integration) and [here](http://docs.pymc.io/advanced_theano.html#writing-custom-theano-ops) – aloctavodia Jan 11 '18 at 17:09
  • @aloctavodia Sir, please look at my edited code. I am able to get an integrated value if a particular beta value is given as input. When I try to incorporate it inside the model, an error message is coming `Cannot compute test value: input 0 (theta) of Op Elemwise{cos,no_inplace}(theta) missing default value. Backtrace when that variable is created:` I think the problem is because the beta is not getting input as a number in each iteration. Since I am completely new to python and PyMC3 , I am stuck – Lekshmi Jan 12 '18 at 11:58

1 Answers1

0

I made a few changes to your code, this still does not work, but I hope it is closer to a solution. Please check this thread, as someone is trying so solve essentially the same problem.

class integrateOut(theano.Op):
    def __init__(self, f, t, t0, tf,*args, **kwargs):
        super(integrateOut,self).__init__()
        self.f = f
        self.t = t
        self.t0 = t0
        self.tf = tf

    def make_node(self, *inputs):
        self.fvars=list(inputs)
        try:
            self.gradF = tt.grad(self.f, self.fvars)
        except:
            self.gradF = None
        return theano.Apply(self, self.fvars, [tt.dscalar().type()])

    def perform(self,node, inputs, output_storage):

        args = tuple(inputs)
        f = theano.function([self.t] + self.fvars,self.f)
        output_storage[0][0] = quad(f, self.t0, self.tf, args=args)[0]

    def grad(self,inputs,grads):
        return [integrateOut(g, self.t, self.t0, self.tf)(*inputs)*grads[0] \
                for g in self.gradF]


gamma = 3.77
G = 4.302E-6
rmin = 3.0
R = 95.7
vr = data[:,1]
r = data[:,0]
h = np.pi
interval =  1E10

vnew = []
rnew = []
for j in np.arange(0,59,1):
    vnew.append(vr[j]+(0.05*vr[j] * float(dm.Decimal(rm.randrange(1, 20))/10)))
    rnew.append(r[j]+(0.05*r[j] * float(dm.Decimal(rm.randrange(1, 20))/10)))
vn = np.array(vnew)
rn = np.array(rnew)

def integ(gamma, beta, theta):
    z = tt.cos(theta)**(2*((gamma/(beta - 2)) - 3/2) + 3)    
    return integrateOut(z, theta, -(np.pi)/2, (np.pi)/2)(beta)

with pm.Model() as basic_model:

    M = pm.Uniform('M', lower=0.5*10**12, upper=3.50*10**12)
    beta = pm.Uniform('beta', lower=2.001, upper=2.999)
    theta = pm.Normal('theta', 0, 10**2)

    def logp_func(rn,vn):
        q = (gamma/(beta - 2)) - 3/2
        B = (G*M*1E12) / ((beta -2 )*(R**(3 - beta)))
        K = (gamma - 3) / ((rmin**(3 - gamma)) * integ(gamma, beta, theta) * (2*B)**0.5)
        logp = - np.log(K*((1 -((1/(2*B))*((vn**2)*rn**(beta - 
                        2))))**(q+1))*(rn**(1-gamma +(beta/2))))
        return logp.sum()

    logpvar = pm.DensityDist("logpvar", logp_func, observed={"rn": rn,"vn":vn})
    start = pm.find_MAP()
    #basicmodeltrace = pm.sample()
    print(start)
aloctavodia
  • 2,040
  • 21
  • 28