I'm running this in Jupyter (IPython). I have a dictionary where each key is a compound and each value is an instance of the Models class. I'm iterating through my list of compounds and trying to append each individual model to the models list attribute of the Models instance for that compound.
The running total works perfectly: D_compound_Models[cpd].summation += accuracy adds up whether each model is good or not.
But when I call D_compound_Models[cpd].models.append(model), it looks like ALL of the models for all of the compounds are appended cumulatively to every instance.
len(D_compound_Models[cpd].models) comes out far larger than len(lpo), which should be 780.
I found a workaround: collect the models in a list outside the lpo for-loop and assign that list to the instance attribute at the end. That's not the way I want to do it, though.
Why is the list appending operation not working as it should?
Especially since the cumulative sum is working correctly...
Here's my class:
class Models:
    def __init__(self, compound=None, models=[], summation=0.0, duration=0.0):
        self.compound = compound
        self.models = models
        self.summation = summation
        self.duration = duration
    def score(self):
        return self.summation / len(self.models)
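A minimal sketch of what the models=[] default in this signature does, in case it helps narrow the question down. In Python a default argument value is evaluated once, when the def statement runs, so every instance created without an explicit models argument receives the same list object. The summation attribute is not affected because += on a float rebinds the attribute to a new float instead of mutating a shared object. The class names SharedDefault and FreshDefault and the string stand-ins for models are hypothetical and only serve the illustration; the None sentinel is the usual idiom, not something taken from the original code.

```python
class SharedDefault:
    # Same pattern as the Models class: the [] is created once, when the
    # def statement runs, and reused by every call that omits `models`.
    def __init__(self, compound=None, models=[]):
        self.compound = compound
        self.models = models


class FreshDefault:
    # Hypothetical variant: None is a sentinel, and each instance that omits
    # `models` gets its own new list.
    def __init__(self, compound=None, models=None):
        self.compound = compound
        self.models = [] if models is None else models


a, b = SharedDefault("AG1024"), SharedDefault("AG1478")
a.models.append("model-1")  # strings stand in for trained model objects
b.models.append("model-2")
shared_lengths = (len(a.models), len(b.models))

c, d = FreshDefault("AG1024"), FreshDefault("AG1478")
c.models.append("model-1")
d.models.append("model-2")
fresh_lengths = (len(c.models), len(d.models))

print(shared_lengths, a.models is b.models)  # (2, 2) True
print(fresh_lengths, c.models is d.models)   # (1, 1) False
```

With the shared default, both instances report every appended item, which matches the cumulative lengths described above.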
Here's where I'm adding the instances of the class to their corresponding compounds:
from sklearn.cross_validation import LeavePOut
D_compound_Models = {}
lpo = LeavePOut(len(DF_attributes.index), p=2)  # All combinations of 40 index values while leaving out 2
query_compounds = ["AG1024", "AG1478"]
# Check order of indices
for cpd in query_compounds:
    # Create compound instance
    D_compound_Models[cpd] = Models(compound=cpd)
    # Get sensitivity column for compound
    SR_compound = DF_compoundSensitivity[cpd]
    # Create and train models
    for index_values in lpo:  # There should be 780 of these
        # Create model
        model = # Some model object
        # a bunch of model training stuff that's irrelevant
        accuracy = # 1 or 0
        # Store models
        D_compound_Models[cpd].models.append(model)
        D_compound_Models[cpd].summation += accuracy
This is how I'm checking it at the end:
for cpd in query_compounds:
    M = D_compound_Models[cpd]
    print(cpd, M.summation, len(M.models), M.score())
This is the correct output, produced the way I don't want to do it (assigning the list at the end):
#Correct: note that 630 and 528 are not 780. I do some filtering, so that's expected
('AG1024', 304.0, 630, 0.48253968253968255)
('AG1478', 221.0, 528, 0.4185606060606061)
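For reference, the workaround can be sketched roughly as follows. This is a simplified stand-in, not the real training loop: the per-compound counts (3 and 2), the string placeholders for models, and the names results and collected are made up for illustration. The point it shows is that assigning a freshly built list to the attribute rebinds it, so the list created by the models=[] default is never mutated.

```python
class Models:
    # Simplified copy of the class from the question; models=[] is kept on purpose.
    def __init__(self, compound=None, models=[], summation=0.0):
        self.compound = compound
        self.models = models
        self.summation = summation


results = {}
for cpd, n_models in [("AG1024", 3), ("AG1478", 2)]:  # made-up counts
    results[cpd] = Models(compound=cpd)
    collected = []  # a new list for every compound
    for i in range(n_models):
        collected.append("{}-model-{}".format(cpd, i))  # stand-in for a trained model
        results[cpd].summation += 1.0
    results[cpd].models = collected  # rebinds the attribute to the new list

lengths = {cpd: len(m.models) for cpd, m in results.items()}
print(lengths)  # {'AG1024': 3, 'AG1478': 2}
```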
This is the incorrect output from the code above (appending to the list):
#Incorrect
('AG1024', 304.0, 1158, 0.26252158894645944)
('AG1478', 221.0, 1158, 0.19084628670120898)
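One detail in these numbers: both instances report the same length, and 1158 = 630 + 528, the sum of the two correct counts. That is what would be expected if both instances were holding the same list object. A small sketch of how that could be checked, assuming Python 3; range() values stand in for the filtered models, and the variable names first, second, same_list and default_list are hypothetical:

```python
class Models:
    # Same constructor signature as the class in the question.
    def __init__(self, compound=None, models=[], summation=0.0, duration=0.0):
        self.compound = compound
        self.models = models
        self.summation = summation
        self.duration = duration


first = Models(compound="AG1024")
second = Models(compound="AG1478")

# Stand-ins for the 630 and 528 models that survive filtering.
first.models.extend(range(630))
second.models.extend(range(528))

same_list = first.models is second.models          # identity, not equality
default_list = Models.__init__.__defaults__[1]     # the list stored on the function
print(same_list, len(first.models), len(second.models), len(default_list))
# True 1158 1158 1158
```

In the notebook itself, the equivalent check would be D_compound_Models["AG1024"].models is D_compound_Models["AG1478"].models.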
I'm making sure to reset the class every time I run it in Jupyter. I've even restarted the kernel and got the same results.