I am trying to fit a simple straight line, y = mx + c, to some synthetic data using parallel-tempered MCMC. My goal is just to understand how to use it, so that I can apply it to more complex models later. The example I am trying is a replica of what has already been done with the plain emcee sampler:
http://dfm.io/emcee/current/user/line/
but instead of the ordinary MCMC, I want to use parallel-tempered MCMC:
http://dfm.io/emcee/current/user/pt/
Here is the code I have, which runs:
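For reference, the likelihood I am sampling is the same one as in the line-fitting tutorial: the quoted errors yerr are assumed to be underestimated by a fractional amount f, so (up to an additive constant)

ln L = -1/2 * sum_n [ (y_n - m*x_n - b)^2 / s_n^2 + ln(s_n^2) ],   with   s_n^2 = yerr_n^2 + exp(2*lnf) * (m*x_n + b)^2,

and the prior is just flat within the bounds set in lnprior below.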
import numpy as np
from emcee import PTSampler
import emcee
# Choose the "true" parameters.
m_true = -0.9594
b_true = 4.294
f_true = 0.534
# Generate some synthetic data from the model.
N = 50
x = np.sort(10*np.random.rand(N))
yerr = 0.1+0.5*np.random.rand(N)
y = m_true*x+b_true
y += np.abs(f_true*y) * np.random.randn(N)
y += yerr * np.random.randn(N)
def lnlike(theta, x, y, yerr):
    # Gaussian log-likelihood with the quoted errors inflated by the factor f
    m, b, lnf = theta
    model = m * x + b
    inv_sigma2 = 1.0/(yerr**2 + model**2*np.exp(2*lnf))
    return -0.5*(np.sum((y-model)**2*inv_sigma2 - np.log(inv_sigma2)))

def lnprior(theta):
    # flat prior inside the box, -inf outside
    m, b, lnf = theta
    if -5.0 < m < 0.5 and 0.0 < b < 10.0 and -10.0 < lnf < 1.0:
        return 0.0
    return -np.inf

def lnprob(theta, x, y, yerr):
    # full log-posterior (only needed for the plain EnsembleSampler version)
    lp = lnprior(theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + lnlike(theta, x, y, yerr)
import scipy.optimize as op
nll = lambda *args: -lnlike(*args)
result = op.minimize(nll, [m_true, b_true, np.log(f_true)], args=(x, y, yerr))
m_ml, b_ml, lnf_ml = result["x"]
init = [0.5, m_ml, b_ml, lnf_ml]
ntemps = 10
nwalkers = 100
ndim = 3
from multiprocessing import Pool
pos = np.empty((ntemps, nwalkers, ndim))
for i in range(ntemps):
    # initialize the walkers of each temperature in a small ball around the scipy optimum
    pos[i] = result["x"] + 1e-4*np.random.randn(nwalkers, ndim)
pool = Pool(processes=4)
sampler = PTSampler(ntemps, nwalkers, ndim, lnlike, lnprior, loglargs=(x, y, yerr), pool=pool)
# burn-in
sampler.run_mcmc(pos, 1000)
sampler.reset()
# production run, keeping every 10th sample
sampler.run_mcmc(pos, 10000, thin=10)
samples = sampler.chain.reshape((-1, ndim))
print('Number of posterior samples is {}'.format(samples.shape[0]))
# print the best-fit values (medians) together with their +/- uncertainties
print(list(map(lambda v: (v[1], v[2]-v[1], v[1]-v[0]),
               zip(*np.percentile(samples, [16, 50, 84], axis=0)))))
import corner
fig = corner.corner(samples, labels=["$m$", "$b$", r"$\ln\,f$"],
                    truths=[m_true, b_true, np.log(f_true)])
fig.savefig("triangle.png")
The only problem is that when I run this code, the best-fit parameter values I get are way off from the true values, and increasing the number of walkers or samples does not help at all. Can anyone please advise why parallel-tempered MCMC is not working here?
Update:
I found a useful package called ptemcee
(https://pypi.org/project/ptemcee/#description), although its documentation is essentially non-existent. It seems that this package might be useful, so any help on how to implement the same linear fit with ptemcee would also be highly appreciated.
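For what it's worth, ptemcee was split out of emcee's old PTSampler, so I am assuming its Sampler keeps essentially the same interface: a constructor taking nwalkers, ndim, the log-likelihood and log-prior, with ntemps and loglargs as keywords, plus run_mcmc, reset and a chain of shape (ntemps, nwalkers, nsteps, ndim). Under that assumption, an untested sketch of the same fit would look like this:

import ptemcee

# Assumed signature, mirroring the old PTSampler:
# ptemcee.Sampler(nwalkers, ndim, logl, logp, ntemps=..., loglargs=..., pool=...)
pt_sampler = ptemcee.Sampler(nwalkers, ndim, lnlike, lnprior,
                             ntemps=ntemps, loglargs=(x, y, yerr))

# burn-in, then production, reusing the same pos array as above
pt_sampler.run_mcmc(pos, 1000)
pt_sampler.reset()
pt_sampler.run_mcmc(pos, 10000, thin=10)

# assuming chain has shape (ntemps, nwalkers, nsteps, ndim); the
# zero-temperature chain is the one that samples the actual posterior
pt_samples = pt_sampler.chain[0].reshape((-1, ndim))

If the constructor arguments differ, the ptemcee source (sampler.py) should make the actual signature clear.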