
I am trying to reproduce the work from http://jheusser.github.io/2013/09/08/hawkes.html in Python, but with different data. I have written code to simulate a Poisson process as well as the Hawkes process they describe.

To compute the MLE for the Hawkes model I define the log-likelihood function as

import math
import numpy as np
from scipy.optimize import minimize

def loglikelihood(params, data):
    (mu, alpha, beta) = params
    tlist = np.array(data)
    r = np.zeros(len(tlist))
    for i in xrange(1, len(tlist)):
        # Recursive term: r[i] = sum over j < i of exp(-beta*(t_i - t_j)).
        r[i] = math.exp(-beta*(tlist[i]-tlist[i-1]))*(1+r[i-1])
    loglik = -tlist[-1]*mu # -mu*T, with the observation window taken to end at the last event.
    loglik = loglik+alpha/beta*np.sum(np.exp(-beta*(tlist[-1]-tlist))-1) # Integral of the excitation terms over the window.
    loglik = loglik+np.sum(np.log(mu+alpha*r)) # Log-intensities summed over the event times.
    return -loglik # Negated so that minimize() maximises the likelihood.
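
For reference, the function above computes the standard recursive form of the exponential-kernel Hawkes log-likelihood (the recursion is usually attributed to Ozaki, 1979), with the observation window taken to end at the last event time:

\log L(\mu,\alpha,\beta) = -\mu t_n + \sum_{i=1}^{n} \frac{\alpha}{\beta}\left(e^{-\beta(t_n - t_i)} - 1\right) + \sum_{i=1}^{n} \log\left(\mu + \alpha R(i)\right),

where R(1) = 0 and R(i) = e^{-\beta(t_i - t_{i-1})}\left(1 + R(i-1)\right), which is exactly the r array in the code.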

Using some dummy data, we can compute the MLE for the Hawkes process with

atimes = [58.98353497, 59.28420225, 59.71571013, 60.06750179, 61.24794134,
          61.70692463, 61.73611983, 62.28593814, 62.51691723, 63.17370423,
          63.20125152, 65.34092403, 214.24934446, 217.0390236, 312.18830525,
          319.38385604, 320.31758188, 323.50201334, 323.76801537, 323.9417007]

res = minimize(loglikelihood, (0.01, 0.1, 0.1), method='Nelder-Mead', args=(atimes,))
print res

However, I don't know how to do the following things in Python.

  1. How can I do the equivalent of evalCIF to get a fitted-versus-empirical intensities plot like theirs?
  2. How can I compute the residuals for the Hawkes model to make the equivalent of their QQ plot? They say they use an R package called ptproc, but I can't find a Python equivalent.
Simd
  • Are you sure this is a programming-only question? Could you at least say what it is you're trying to do exactly, without forcing everyone to read the documentation of the code you're referring to? As a fellow scientist I'd advise you to get in touch with the authors for some help if you need it. But I doubt they'll translate their code into Python for you. – Aleksander Lidtke Jul 16 '14 at 18:25
  • @AleksanderLidtke I would like to compare how well the Hawkes model and the Poisson model fit the data I have. They seemed to have a good way of doing this but it is all in R. I would like to reproduce it with my data in Python. The evalCIF code seems somehow to compare the empirical intensities (that is, the data) with the fitted model. – Simd Jul 16 '14 at 18:39
  • I know statsmodels does QQ plots but sadly this doesn't help create the residuals. – Simd Jul 16 '14 at 18:41
  • Alright, why not just fit a model through the data you have and compare the empirical data to it? Do you have such a model at all? If not there are ways around it, I've found Gaussian mixtures to be really good at fitting them through literally anything and there seems to be an implementation in Python (untested by myself): http://scikit-learn.org/0.5/modules/gmm.html – Aleksander Lidtke Jul 16 '14 at 18:45
  • @AleksanderLidtke The two models that I am testing are the Poisson process (which has only one parameter, the intensity) and the Hawkes model, which has three. The code I give in the question shows how to compute the MLE for the Hawkes model. The only way I know how to compare those models to the empirical data is by either plotting the thing they do with evalCIF or computing the residuals and doing a QQ plot. But I can't work out how to do it in python. Are Gaussian mixtures relevant for self-exciting point processes? I am no expert on this – Simd Jul 16 '14 at 18:53
  • Hmm, looking through the paper, a Gaussian mixture seems like overkill. I'll have a look at it in a while, mainly because I'm a Bitcoin fan ;) – Aleksander Lidtke Jul 16 '14 at 19:38
  • @AleksanderLidtke Thank you. I am sure lots of people would love to be able to do what they did in python too. – Simd Jul 16 '14 at 19:40
  • @user2179021 so I understand now you want to fit a function through `atimes` and `conditionalIntensities` by finding the best `alpha`, `beta` and `mu`? – Aleksander Lidtke Jul 16 '14 at 20:54
  • @AleksanderLidtke I have data and I can find the best alpha, beta and mu using minimize. What I would like to do, after having done that, is see how good the fit is. This is what he manages to do with the QQ plot, for example. – Simd Jul 16 '14 at 21:09
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/57446/discussion-between-aleksander-lidtke-and-user2179021). – Aleksander Lidtke Jul 16 '14 at 21:09

1 Answer


OK, so the first thing you may wish to do is plot the data. To keep it simple I've reproduced this figure from the article, as it only has 8 events occurring, so it's easy to see the behaviour of the system. The following code:

import numpy as np
import math, matplotlib
import matplotlib.pyplot
import matplotlib.lines

mu = 0.1 # Parameter values as found in the article http://jheusser.github.io/2013/09/08/hawkes.html, Hawkes Process section.
alpha = 1.0
beta = 0.5

EventTimes = np.array([0.7, 1.2, 2.0, 3.8, 7.1, 8.2, 8.9, 9.0])

" Compute conditional intensities for all times using the Hawkes process. "
timesOfInterest = np.linspace(0.0, 10.0, 100) # Times where the intensity will be sampled.
conditionalIntensities = [] # Conditional intensity for every epoch of interest.
for t in timesOfInterest:
    conditionalIntensities.append( mu + np.array( [alpha*math.exp(-beta*(t-ti)) if t > ti else 0.0 for ti in EventTimes] ).sum() ) # Find the contributions of all preceding events to the overall chance of another one occurring. All events that occur after t have no contribution.

" Plot the conditional intensity time history. "
fig = matplotlib.pyplot.figure()
ax = fig.gca()

labelsFontSize = 16
ticksFontSize = 14

fig.suptitle(r"$Conditional\ intensity\ VS\ time$", fontsize=20)
ax.grid(True)
ax.set_xlabel(r'$Time$', fontsize=labelsFontSize)
ax.set_ylabel(r'$\lambda$', fontsize=labelsFontSize)
matplotlib.rc('xtick', labelsize=ticksFontSize)
matplotlib.rc('ytick', labelsize=ticksFontSize)

eventsScatter = ax.scatter(EventTimes, np.ones(len(EventTimes))) # Just to indicate where the events took place.

ax.plot(timesOfInterest, conditionalIntensities, color='red', linestyle='solid', marker=None, markerfacecolor='blue', markersize=12)
fittedPlot = matplotlib.lines.Line2D([], [], color='red', linestyle='solid', marker=None, markerfacecolor='blue', markersize=12)

fig.legend([fittedPlot, eventsScatter], [r'$Conditional\ intensity\ computed\ from\ events$', r'$Events$'])
matplotlib.pyplot.show()

reproduces the figure pretty accurately, even though I've chosen the event epochs somewhat arbitrarily:

(figure: Simple Hawkes Process)

This can also be applied to the example dataset of 5000 trades by binning the data and treating every bin as an event. However, every event then carries a slightly different weight, because a different number of trades occurs in every bin. The article also mentions this in the Fitting Bitcoin Trade Arrival to a Hawkes Process section, and proposes a way to overcome the problem: "The only difference to the original dataset is that I added a random millisecond timestamp to all trades that share a timestamp with another trade. This is required as the model requires to distinguish every trade (i.e. every trade must have a unique timestamp)." This is incorporated in the following code:

import numpy as np
import math, matplotlib, pandas
import scipy.optimize
import matplotlib.pyplot
import matplotlib.lines

" Read example trades' data. "
all_trades = pandas.read_csv('all_trades.csv', parse_dates=[0], index_col=0) # All trades' data.
all_counts = pandas.DataFrame({'counts': np.ones(len(all_trades))}, index=all_trades.index) # Only the count of the trades is really important.
empirical_1min = all_counts.resample('1min', how='sum') # Bin the data to find the number of trades in 1-minute intervals.

baseEventTimes = np.array( range(len(empirical_1min.values)), dtype=np.float64) # Dummy times when the events take place; don't care too much about the actual epochs where the bins are placed - this could be scaled to days since epoch, seconds since epoch or any other measure of time.
eventTimes = [] # With the event batches split into separate events.
for i in range(len(empirical_1min.values)): # Deal with many events occurring at the same time - distinguish between them by splitting each batch of events into distinct events taking place at almost the same time.
    if not np.isnan(empirical_1min.values[i]):
        for j in range( int(empirical_1min.values[i]) ): # Cast to int - the bin counts are floats because of the NaNs.
            eventTimes.append(baseEventTimes[i]+0.000001*(j+1)) # For every event that occurs at this epoch enter a dummy event very close to it in time that will increase the conditional intensity.

eventTimes = np.array( eventTimes, dtype=np.float64 ) # Change to array for ease of operations.

" Find a fit for alpha, beta, and mu that minimises the negative log-likelihood for the input data. "
#res = scipy.optimize.minimize(loglikelihood, (0.01, 0.1, 0.1), method='Nelder-Mead', args=(eventTimes,))
#(mu, alpha, beta) = res.x
mu = 0.07 # Parameter values as found in the article.
alpha = 1.18
beta = 1.79

" Compute conditional intensities for all epochs using the Hawkes process - add more points to see how the effect of individual events decays over time. "
conditionalIntensitiesPlotting = [] # Conditional intensity for every epoch of interest.
timesOfInterest = np.linspace(eventTimes.min(), eventTimes.max(), eventTimes.size*10) # Times where the intensity will be sampled. Sample at a much higher frequency than the events occur at.
for t in timesOfInterest:
    conditionalIntensitiesPlotting.append( mu + np.array( [alpha*math.exp(-beta*(t-ti)) if t > ti else 0.0 for ti in eventTimes] ).sum() ) # Find the contributions of all preceding events to the overall chance of another one occurring. All events that occur after the time of interest t have no contribution.

" Compute conditional intensities at the same epochs as the empirical data are known. "
conditionalIntensities = [] # This will be used in the QQ plot later; it has to have the same size as the empirical data.
for t in np.linspace(eventTimes.min(), eventTimes.max(), eventTimes.size):
    conditionalIntensities.append( mu + np.array( [alpha*math.exp(-beta*(t-ti)) if t > ti else 0.0 for ti in eventTimes] ).sum() ) # Use eventTimes here as well to feel the influence of all the events that happen at the same time.

" Plot the empirical and fitted datasets. "
fig = matplotlib.pyplot.figure()
ax = fig.gca()

labelsFontSize = 16
ticksFontSize = 14

fig.suptitle(r"$Conditional\ intensity\ VS\ time$", fontsize=20)
ax.grid(True)
ax.set_xlabel(r'$Time$', fontsize=labelsFontSize)
ax.set_ylabel(r'$\lambda$', fontsize=labelsFontSize)
matplotlib.rc('xtick', labelsize=ticksFontSize)
matplotlib.rc('ytick', labelsize=ticksFontSize)

# Plot the empirical binned data.
ax.plot(baseEventTimes, empirical_1min.values, color='blue', linestyle='solid', marker=None, markerfacecolor='blue', markersize=12)
empiricalPlot = matplotlib.lines.Line2D([], [], color='blue', linestyle='solid', marker=None, markerfacecolor='blue', markersize=12)

# And the fit obtained using the Hawkes function.
ax.plot(timesOfInterest, conditionalIntensitiesPlotting, color='red', linestyle='solid', marker=None, markerfacecolor='blue', markersize=12)
fittedPlot = matplotlib.lines.Line2D([], [], color='red', linestyle='solid', marker=None, markerfacecolor='blue', markersize=12)

fig.legend([fittedPlot, empiricalPlot], [r'$Fitted\ data$', r'$Empirical\ data$'])
matplotlib.pyplot.show()

This generates the following fit to the data:

(figure: Hawkes Process with real trades' data)

All looks good but, when you look at the detail, you'll see that computing the residuals by simply taking one vector of the numbers of trades and subtracting the fitted one won't do, because the two vectors have different lengths:

(figure: Hawkes Process with real trades' data, close-up)

It is possible, however, to extract the intensity at the same epochs as those where it was recorded for the empirical data, and then compute the residuals. This enables you to find the quantiles of both the empirical and the fitted data and plot them against each other, thus generating the QQ plot:

""" GENERATE THE QQ PLOT. """
" Process the data and compute the quantiles. "
orderStatistics=[]; orderStatistics2=[];
for i in range( empirical_1min.values.size ): # Make sure all the NANs are filtered out and both arrays have the same size.
    if not np.isnan( empirical_1min.values[i] ):
        orderStatistics.append(empirical_1min.values[i])
        orderStatistics2.append(conditionalIntensities[i])
orderStatistics = np.array(orderStatistics); orderStatistics2 = np.array(orderStatistics2);

orderStatistics.sort(axis=0) # Need to sort data in ascending order to make a QQ plot.    orderStatistics is a column vector.
 orderStatistics2.sort()

 smapleQuantiles=np.zeros( orderStatistics.size ) # Quantiles of the empirical data.
 smapleQuantiles2=np.zeros( orderStatistics2.size ) # Quantiles of the data fitted using the Hawkes process.
for i in range( orderStatistics.size ):
    temp = int( 100*(i-0.5)/float(smapleQuantiles.size) ) # (i-0.5)/float(smapleQuantiles.size) th quantile. COnvert to % as expected by the numpy  function.
    if temp<0.0:
        temp=0.0 # Avoid having -ve percentiles.
    smapleQuantiles[i] = np.percentile(orderStatistics, temp)
    smapleQuantiles2[i] = np.percentile(orderStatistics2, temp)

" Make the quantile plot of empirical data first. "
fig2 = matplotlib.pyplot.figure()
ax2 = fig2.gca(aspect="equal")

fig2.suptitle(r"$Quantile\ plot$", fontsize=20)
ax2.grid(True)
ax2.set_xlabel(r'$Sample\ fraction\ (\%)$',fontsize=labelsFontSize)
ax2.set_ylabel(r'$Observations$',fontsize=labelsFontSize)
matplotlib.rc('xtick', labelsize=ticksFontSize) 
matplotlib.rc('ytick', labelsize=ticksFontSize)

distScatter = ax2.scatter(smapleQuantiles, orderStatistics, c='blue', marker='o') # If these are close to the straight line with slope line these points come from a normal distribution.

ax2.plot(smapleQuantiles, smapleQuantiles, color='red', linestyle='solid', marker=None, markerfacecolor='red', markersize=12)
normalDistPlot = matplotlib.lines.Line2D([],[],color='red', linestyle='solid', marker=None, markerfacecolor='red', markersize=12)

fig2.legend([normalDistPlot, distScatter], [r'$Normal\ distribution$', r'$Empirical\ data$'])
 matplotlib.pyplot.show()

" Make a QQ plot. "
fig3 = matplotlib.pyplot.figure()
ax3 = fig3.gca(aspect="equal")

fig3.suptitle(r"$Quantile\ -\ Quantile\ plot$", fontsize=20)
ax3.grid(True)
ax3.set_xlabel(r'$Empirical\ data$',fontsize=labelsFontSize)
ax3.set_ylabel(r'$Data\ fitted\ with\ Hawkes\ distribution$',fontsize=labelsFontSize)
matplotlib.rc('xtick', labelsize=ticksFontSize) 
matplotlib.rc('ytick', labelsize=ticksFontSize)

distributionScatter = ax3.scatter(smapleQuantiles, smapleQuantiles2, c='blue', marker='x') # If these are close to the straight line with slope line these points come from a normal distribution.

ax3.plot(smapleQuantiles, smapleQuantiles, color='red', linestyle='solid', marker=None,     markerfacecolor='red', markersize=12)
normalDistPlot2 = matplotlib.lines.Line2D([],[],color='red', linestyle='solid',  marker=None, markerfacecolor='red', markersize=12)

fig3.legend([normalDistPlot2, distributionScatter], [r'$Normal\ distribution$', r'$Comparison\ of\ datasets$'])
matplotlib.pyplot.show()

This generates the following plots:

(figures: quantile plot of the empirical data, and the QQ plot)

The quantile plot of the empirical data isn't exactly the same as in the article; I'm not sure why, as I'm not great with statistics. But, from a programming standpoint, this is how you can go about all of this.
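
One of the comments below points out that the proper residuals for a point process come from rescaling time rather than from comparing binned counts. Under the time-rescaling theorem, if the model is right then the compensator increments Lambda(t_i) - Lambda(t_{i-1}) are i.i.d. Exp(1), and for the exponential kernel the compensator has a closed form. A minimal sketch of that check, assuming the fitted mu, alpha and beta and the eventTimes array from the code above:

import numpy as np
import scipy.stats
import matplotlib.pyplot

def compensator(eventTimes, mu, alpha, beta):
    " Integrated conditional intensity Lambda(t_k) of the exponential-kernel Hawkes process, evaluated at every event time. "
    Lambda = np.zeros(eventTimes.size)
    for k in range(eventTimes.size):
        past = eventTimes[eventTimes < eventTimes[k]] # Only preceding events contribute.
        Lambda[k] = mu*eventTimes[k] + (alpha/beta)*np.sum(1.0 - np.exp(-beta*(eventTimes[k] - past)))
    return Lambda

rescaledGaps = np.diff( compensator(eventTimes, mu, alpha, beta) ) # Should be i.i.d. Exp(1) if the model is right.
scipy.stats.probplot(rescaledGaps, dist='expon', plot=matplotlib.pyplot) # QQ plot against the exponential distribution.
matplotlib.pyplot.show()

This loop is quadratic in the number of events; the same recursion used in loglikelihood above would bring it down to linear time if that matters.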

Aleksander Lidtke
  • Thanks for this. I am not sure the method is right, however. I think we have to rescale time and then do a QQ plot against the exponential distribution. This would explain why the plots aren't the same. – Simd Jul 20 '14 at 14:13
  • I really want your code to be on GitHub. Is it already in some GitHub repository, or could you make one? – jeongmin.cha Mar 06 '17 at 18:10
  • If you don't mind, I want to make a repository for the code in this question and answer. It helped me a lot and I believe it should be improved further by everyone (open-source mindset). That's why I want this code to exist on GitHub. – jeongmin.cha Mar 08 '17 at 06:55
  • @jeongmin.cha Hi, sorry, been a bit busy so didn't read this until now. Thanks, I'm glad this was useful. In fact, I already have all this on my git. I'd be happy to collaborate on this, if you want. Here's the link: https://github.com/AleksanderLidtke/Hawkes-Process – Aleksander Lidtke Mar 16 '17 at 00:28
  • @AleksanderLidtke That seems pretty good :) but I think it would be better if it were improved further. I'll send a pull request in my free time. Thanks for letting me know about your repository. – jeongmin.cha Mar 17 '17 at 08:53
  • @jeongmin.cha I agree, there's **a lot** that could be done there. The code in the repo is just what I wrote for this answer. I thought I'd get to develop it further at some point but never had any stimulus; this could be it, thanks :) – Aleksander Lidtke Mar 21 '17 at 00:55
  • @jeongmin.cha You can also have a look at this freshly open-sourced GitHub repo https://github.com/X-DataInitiative/tick if you want to experiment with Hawkes processes in Python :) Disclaimer: I'm one of the first contributors – user2015762 May 15 '17 at 12:51
  • Very late to the party but when I tried to run this the `scipy.optimize.minimize` call gives very different values to those seen in the blog post. Does anyone know why? – JMzance Nov 05 '17 at 21:23
  • @JMzance this could have something to do with the version of scipy and convergence of the optimiser. Sorry, I don't remember what version of scipy I used to write this answer. How much *exactly* is _very_ different? – Aleksander Lidtke Nov 10 '17 at 00:46
  • @AleksanderLidtke if I run the above verbatim the optimizer fails to converge at: `x: array([ 8.89e-01, 3.60e+05, 3.88e+05])` – JMzance Nov 10 '17 at 10:09
  • Hence 'very different' :) – JMzance Nov 10 '17 at 10:09
  • @JMzance OK that is indeed different. I used Python 2.7.something for this, no idea about the version of scipy. What are you using? – Aleksander Lidtke Nov 10 '17 at 10:35
  • python 2.7 with scipy 1.0.0 and numpy 1.13.3 – JMzance Nov 10 '17 at 10:53
  • @JMzance OK so we definitely used different versions of scipy for this. The problem you're experiencing is that the default optimiser settings no longer converge. If you choose different settings (see the sketch after this thread), it'll probably converge. I seem to recall that I used 0.18 or an even earlier version for this. Sorry I can't be of much help but it's a really busy period for me... – Aleksander Lidtke Nov 12 '17 at 00:36
  • @JMzance Hi, did you figure out why it is not converging? – Pier-Olivier Marquis Jul 15 '18 at 16:11
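
Following up on the convergence discussion above: a minimal, untested sketch of passing tighter settings to the Nelder-Mead optimiser. The option names are those of scipy 1.0 (xatol/fatol replaced the older xtol/ftol); the tolerance and iteration values are arbitrary assumptions, not taken from the original answer:

import scipy.optimize

# Hypothetical settings - tighter tolerances and more evaluations than the defaults.
res = scipy.optimize.minimize(
    loglikelihood, (0.01, 0.1, 0.1), method='Nelder-Mead', args=(eventTimes,),
    options={'xatol': 1e-8, 'fatol': 1e-8, 'maxiter': 10000, 'maxfev': 10000})
if not res.success:
    print res.message # Inspect why the optimiser stopped early.

Restarting the optimiser from res.x a few times is another cheap way to check that it has actually settled.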