
I am new to Python and am trying to modify a pair trading script that I found here: https://github.com/quantopian/zipline/blob/master/zipline/examples/pairtrade.py

The original script is designed to use only prices. I would like to fit my models on returns and use prices only for the invested quantity, but I don't see how to do it.

I have tried:

  • defining a data frame of returns in the main section and passing it to run
  • defining a data frame of returns in the main section as a global object and using it where needed in handle_data
  • defining a data frame of returns directly in handle_data

I assume the last option is the most appropriate, but it gives me an error about the pandas 'shift' attribute.

More specifically, I try to define 'DataRegression' as follows:

DataRegression = data.copy()
DataRegression[Stock1]=DataRegression[Stock1]/DataRegression[Stock1].shift(1)-1
DataRegression[Stock2]=DataRegression[Stock2]/DataRegression[Stock2].shift(1)-1
DataRegression[Stock3]=DataRegression[Stock3]/DataRegression[Stock3].shift(1)-1
DataRegression = DataRegression.dropna(axis=0)

where 'data' is a data frame that contains prices, and Stock1, Stock2 and Stock3 are column names defined globally. When placed in handle_data, these lines raise the error:

File "A:\Apps\Python\Python.2.7.3.x86\lib\site-packages\zipline-0.5.6-py2.7.egg\zipline\utils\protocol_utils.py", line 85, in __getattr__
return self.__internal[key]
KeyError: 'shift'

Would anyone know why this happens and how to do it correctly?

Many Thanks, Vincent

VincentH
  • Is it the last line which is causing the exception? (`DataRegression = DataRegression.dropna(axis=0)` ?) – Andy Hayden Feb 11 '13 at 12:35
  • It is the second line which is causing the exception – VincentH Feb 11 '13 at 12:41
  • (That makes more sense with the error!) Does this mean that `DataRegression[Stock1].shift(1)` throws the same exception? Can you confirm the output of `type(DataRegression[Stock1])`? – Andy Hayden Feb 11 '13 at 12:52
  • Yes DataRegression[Stock1].shift(1) throws the same exception. The type is 'zipline.utils.protocol_utils.ndict' – VincentH Feb 11 '13 at 13:09
  • I assume the zipline library has changed the type: DataRegression is not a pandas data frame but a zipline object :( – VincentH Feb 11 '13 at 13:16
  • Is that the type of data also, perhaps `pd.DataFrame(data)` fixes this? – Andy Hayden Feb 11 '13 at 13:25
  • No, the pandas DataFrame constructor does not work anymore, and if I try to copy my data before it is turned into a zipline object I get an UnboundLocalError. – VincentH Feb 11 '13 at 14:04

1 Answer


This is an interesting idea. The easiest way to do this in zipline is to use the Returns transform, which adds a returns field to the event frame (which is an ndict, not a pandas DataFrame, as someone pointed out in the comments).
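To illustrate why calling `.shift` on such an object ends in `KeyError: 'shift'`, here is a minimal sketch. `AttrDict` is a hypothetical stand-in written for this example, not zipline's actual ndict; it only mimics the behaviour visible in the traceback, where `__getattr__` looks the attribute name up in an internal dictionary.

```python
class AttrDict(object):
    """Hypothetical stand-in for an ndict-like object (an assumption,
    not zipline's code): attribute access falls back to a dict lookup."""

    def __init__(self, values):
        self._internal = dict(values)

    def __getitem__(self, key):
        return self._internal[key]

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails, so any
        # unknown name (such as 'shift') becomes a dictionary lookup.
        return self._internal[key]


def try_shift(obj):
    """Call obj.shift(1) and report what happens as a string."""
    try:
        obj.shift(1)
    except KeyError as exc:
        return "KeyError: %s" % exc
    return "ok"


# A single event carries fields such as a price, not a price history.
event = AttrDict({"price": 101.5})
message = try_shift(event)
print(event.price)
print(message)
```

Because the object holds one value per field rather than a time series, there is nothing to shift, which is why the history has to come from a transform.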

For this you have to add the transform in the initialize method: `self.add_transform(Returns, 'returns', window_length=1)`

(make sure to add `from zipline.transforms import Returns` at the top of the script).

Then, inside the batch_transform you can access returns instead of prices:

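# Imports as used by the original pairtrade.py example
import statsmodels.api as sm
from zipline.transforms import batch_transform

@batch_transform
def ols_transform(data, sid1, sid2):
    """Computes regression coefficient (slope and intercept)
    via Ordinary Least Squares between the returns of two SIDs.
    """
    p0 = data.returns[sid1]
    # prepend=False puts the constant in the last column, so params
    # come back as (slope, intercept) regardless of the statsmodels
    # version's default.
    p1 = sm.add_constant(data.returns[sid2], prepend=False)
    slope, intercept = sm.OLS(p0, p1).fit().params

    return slope, intercept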

Alternatively, you could create a batch_transform that converts prices to returns, as you wanted to do:

@batch_transform
def returns(data):
    return data.price / data.price.shift(1) - 1

Then pass the result to the OLS transform, or do this computation inside the OLS transform itself.
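Outside of zipline, the price-to-returns step and the regression on returns can be sketched with plain pandas and NumPy. The price values and column names "A" and "B" below are made up for illustration, and `np.polyfit` is swapped in for the statsmodels OLS call only to keep the example short.

```python
import numpy as np
import pandas as pd

# Illustrative prices for two hypothetical stocks "A" and "B".
prices = pd.DataFrame({
    "A": [100.0, 102.0, 101.0, 103.0, 104.0],
    "B": [50.0, 50.5, 50.2, 51.0, 51.6],
})

# Simple returns: price today divided by price yesterday, minus one.
# The first row has no previous price, so it is NaN and is dropped.
returns = (prices / prices.shift(1) - 1).dropna()

# pandas offers the same computation as pct_change().
same = prices.pct_change().dropna()

# Degree-1 least-squares fit of A's returns on B's returns;
# np.polyfit returns the coefficients highest degree first.
slope, intercept = np.polyfit(returns["B"], returns["A"], 1)

print(returns)
print(slope, intercept)
```

The shift-based expression works here because `prices` is a real DataFrame holding the full history, which is what a batch_transform receives in `data.price`.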

HTH, Thomas

twiecki
  • Thank you very much for your answer. I am not very familiar with @batch_transform so far. – VincentH Apr 17 '13 at 15:43