I'm writing a method that calculates the covariance of 2 to 8 time-series variables. Each variable is passed to the method as a list. The method should return a single number, not a covariance matrix.
The method works fine the first time it's called. Every call after that returns 0. An example is included at the bottom, below my code. Any advice/feedback regarding the variable scope issues here would be greatly appreciated. Thanks!
p = [3,4,4,654]
o = [4,67,4,1]
class Toolkit():
    def CovarianceScalar(self, column1, column2=[], column3=[], column4=[],
                         column5=[], column6=[], column7=[], column8=[]):
        """Assumes all columns have length equal to len(column1)"""
        # If only the first column is passed, this will act as a variance function
        import numpy as npObject
        # Each flag below is a binary-style number that is set to 1 if the
        # corresponding input vector/list has zero length. This way, the
        # CovarianceResult variable can be computed, and the relevant terms can
        # have a 1 added to them if they would otherwise go to 0, preventing
        # the CovarianceResult value from incorrectly going to 0.
        binUnityFlag2 = 1 if (len(column2) == 0) else 0
        binUnityFlag3 = 1 if (len(column3) == 0) else 0
        binUnityFlag4 = 1 if (len(column4) == 0) else 0
        binUnityFlag5 = 1 if (len(column5) == 0) else 0
        binUnityFlag6 = 1 if (len(column6) == 0) else 0
        binUnityFlag7 = 1 if (len(column7) == 0) else 0
        binUnityFlag8 = 1 if (len(column8) == 0) else 0
        # Some initial housekeeping: ensure that all input column lengths match
        # that of the first column. (Will later advise the user if they do not.)
        lngExpectedColumnLength = len(column1)
        inputList = [column2, column3, column4, column5, column6, column7, column8]
        inputListNames = ["column2", "column3", "column4", "column5", "column6", "column7", "column8"]
        for i in range(0, len(inputList)):
            while len(inputList[i]) < lngExpectedColumnLength:  # Empty inputs now become vectors of 1's.
                inputList[i].append(1)
        # Now start calculating the covariance of the inputs:
        avgColumn1 = sum(column1) / len(column1)  # <-- Each column's average
        avgColumn2 = sum(column2) / len(column2)
        avgColumn3 = sum(column3) / len(column3)
        avgColumn4 = sum(column4) / len(column4)
        avgColumn5 = sum(column5) / len(column5)
        avgColumn6 = sum(column6) / len(column6)
        avgColumn7 = sum(column7) / len(column7)
        avgColumn8 = sum(column8) / len(column8)
        avgList = [avgColumn1, avgColumn2, avgColumn3, avgColumn4,
                   avgColumn5, avgColumn6, avgColumn7, avgColumn8]
        # Start building the scalar-valued result:
        CovarianceResult = float(0)
        for i in range(0, lngExpectedColumnLength):
            CovarianceResult += (
                (column1[i] - avgColumn1)
                * ((column2[i] - avgColumn2) + binUnityFlag2)
                * ((column3[i] - avgColumn3) + binUnityFlag3)
                * ((column4[i] - avgColumn4) + binUnityFlag4)
                * ((column5[i] - avgColumn5) + binUnityFlag5)
                * ((column6[i] - avgColumn6) + binUnityFlag6)
                * ((column7[i] - avgColumn7) + binUnityFlag7)
                * ((column8[i] - avgColumn8) + binUnityFlag8)
            )
        # Finally, divide the sum of the multiplied deviations by the sample size.
        # Coerce both terms to float to prevent return of array-type objects.
        CovarianceResult = float(CovarianceResult) / float(lngExpectedColumnLength)
        return CovarianceResult
Example:
myInst = Toolkit() #Create a class instance.
First execution of the function:
myInst.CovarianceScalar(o,p)
#Returns -2921.25, the covariance of the numbers in lists o and p.
Second time around:
myInst.CovarianceScalar(o,p)
#Returns: 0.0
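Narrowing it down, the symptom seems to be reproducible without any of the covariance arithmetic. Below is a minimal sketch; `pad_shared` and `pad_fresh` are made-up names for illustration and are not part of my class. Python evaluates a default argument once, when the function is defined, so a list default that gets appended to is still padded on the next call. If that is what is happening in `CovarianceScalar`, the unused columns would no longer have zero length on the second call, their unity flags would be 0, and every `(1 - 1) + 0` term would zero out the product. The second function shows the commonly suggested `None` default as one possible workaround.

```python
def pad_shared(n, column=[]):
    """Hypothetical helper: pads the default list in place, like my padding loop."""
    while len(column) < n:
        column.append(1)
    return column


def pad_fresh(n, column=None):
    """Same idea, but builds a new list on every call (copying any input)."""
    column = [] if column is None else list(column)
    while len(column) < n:
        column.append(1)
    return column


first = pad_shared(4)
# Inspect the stored default before calling the function a second time.
was_empty_on_second_call = len(pad_shared.__defaults__[0]) == 0
second = pad_shared(4)

print(first is second)           # both calls hand back the same list object
print(was_empty_on_second_call)  # the stored default is no longer empty
print(pad_fresh.__defaults__)    # the default here is None, so nothing can accumulate
```

If this is the cause, it is less a scope issue than a shared mutable default, and copying the inputs inside the method would also keep the caller's lists from being modified.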