
I have the following app that solves the optimization problem below. The code works, but I find it a bit slow. Any ideas for performance improvements (without writing C code) that make better use of Python, NumPy and SciPy? It seems to me that the interpolation function is the main time-consuming part.

from scipy.optimize import leastsq
from scipy.interpolate import interp1d
import timeit


class Bond(object):
  def __init__(self, years, cpn):
    self.years = years 
    self.coupon = cpn
    self.cashflows = [(0.0, -1.0)]  # (time, amount) pairs; -1.0 at t=0 is the price paid
    self.cashflows.extend([(float(i),self.coupon) for i in range(1,self.years)])
    self.cashflows.append((float(self.years), 1.0 + self.coupon))

  def pv(self, market):
    return sum([cf[1] * market.df(cf[0]) for cf in self.cashflows])

class Market(object):
  def __init__(self, instruments):
    self.instruments = sorted(
        instruments, key=lambda instrument : instrument.cashflows[-1][0])
    self.knots = [0.0]
    self.knots.extend([inst.cashflows[-1][0] for inst in self.instruments])
    self.dfs = [1.0]
    self.dfs.extend([1.0] * len(self.instruments))
    self.interp = interp1d(self.knots, self.dfs)

  def df(self, day):
    return self.interp(day)

  def calibrate(self):
    leastsq(self.__target, self.dfs[1:])

  def __target(self, x):
    self.dfs[1:] = x
    self.interp = interp1d(self.knots, self.dfs)
    return [bond.pv(self) for bond in self.instruments]


def main():
  instruments = [Bond(i, 0.02) for i in range(1, numberOfInstruments + 1)]
  market = Market(instruments)
  market.calibrate()
  print('CALIBRATED')

numberOfTimes = 10
numberOfInstruments = 50
print('%.2f' % float(timeit.timeit(main, number=numberOfTimes)/numberOfTimes))
  • One more thing to notice: once the model has been calibrated, calling calibrate again still takes 4 s to complete! I would have thought the optimizer would stop at the first step. – Dave Jun 08 '11 at 09:49

2 Answers


You should try to vectorize the summations and the calls to the interpolation routine. For example, like this:

import numpy as np

class Bond(object):
  def __init__(self, years, cpn):
    self.years = years
    self.coupon = cpn

    # cashflows as a (years + 1, 2) array of (time, amount) rows
    self.cashflows = np.zeros((self.years + 1, 2))
    self.cashflows[:,0] = np.arange(self.years + 1)
    self.cashflows[:,1] = self.coupon
    self.cashflows[0,:] = 0, -1
    self.cashflows[-1,:] = self.years, 1.0 + self.coupon

  def pv(self, market):
    return (self.cashflows[:,1] * market.df(self.cashflows[:,0])).sum()

which seems to give a ~ 10x speedup. You can also replace the knots and dfs lists in Market with arrays in a similar way.
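For the second hint, here is a minimal sketch of what Market could look like with knots and dfs held as NumPy arrays from the start (a hypothetical rewrite, not code from the original answer; it assumes the vectorized Bond above, whose cashflows attribute is a 2-D array):

import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import leastsq

class Market(object):
  def __init__(self, instruments):
    # cashflows is now a 2-D array, so the last maturity is cashflows[-1, 0]
    self.instruments = sorted(
        instruments, key=lambda inst: inst.cashflows[-1, 0])
    # knots and discount factors as flat arrays, built once
    self.knots = np.concatenate(
        ([0.0], [inst.cashflows[-1, 0] for inst in self.instruments]))
    self.dfs = np.ones(len(self.instruments) + 1)
    self.interp = interp1d(self.knots, self.dfs)

  def df(self, days):
    # interp1d accepts arrays, so each bond needs one call, not one per cashflow
    return self.interp(days)

  def calibrate(self):
    leastsq(self.__target, self.dfs[1:])

  def __target(self, x):
    self.dfs[1:] = x
    self.interp = interp1d(self.knots, self.dfs)
    return [bond.pv(self) for bond in self.instruments]

The saving here is modest: interp1d no longer has to convert Python lists to arrays each time __target rebuilds it, but constructing the interpolant itself remains the dominant cost.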

The reason the re-calibration takes time is that leastsq has to verify again that it really sits at a local minimum. This requires numerical differentiation of the target function, which takes time because you have many free variables. The optimization problem itself is fairly easy, so it converges in a few steps, which means that verifying the minimum takes nearly as much time as solving the problem.
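If you want to see this in the numbers, leastsq can report how many times it evaluated the target function; full_output and infodict['nfev'] are part of SciPy's documented interface. A drop-in variant of Market.calibrate that prints the count:

  def calibrate(self):
    # full_output=1 makes leastsq return extra diagnostics;
    # infodict['nfev'] is the number of target-function evaluations.
    # With ~50 free variables, even a warm start costs one numerical
    # Jacobian, i.e. roughly 50 evaluations, before convergence is declared.
    x, cov_x, infodict, mesg, ier = leastsq(
        self.__target, self.dfs[1:], full_output=1)
    print('nfev: %d' % infodict['nfev'])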

pv.
  • It gives me a great improvement (~20x). Great solution! But I don't get the other hint you gave about replacing the knots and dfs lists with arrays. Which part there could be vectorized (apart from the structure itself)? – Dave Jun 08 '11 at 13:55

@pv's answer is most likely right, but this answer shows a simple way to be sure, and to see whether there is anything further you could do.
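One such check, sketched below with the standard-library profiler (purely an illustration, not necessarily the method this answer originally had in mind; the file name calib.prof is a placeholder):

import cProfile
import pstats

# profile one run of main() and list the 10 most expensive calls by
# cumulative time; the interp1d constructor should appear near the top
cProfile.run('main()', 'calib.prof')
pstats.Stats('calib.prof').sort_stats('cumulative').print_stats(10)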

Mike Dunlavey