I have a Python class that uses scikit-learn (sklearn) to build a decision tree. It runs in under 1 s when I execute it in the Django shell (python manage.py shell): I load the data from a CSV file, instantiate my class, and call the fit function of DecisionTreeClassifier.
However, when I incorporate the same code into a Django view, it takes 6 minutes and 6 GB of RAM to finish.
Could threads or multiprocessing be involved?
Update: I don't think the code is the problem; I suspect the Django environment or WSGI is the cause. However, here is some code:
    import time
    from sklearn import tree

    def myfit(self, x_i, y, n):  # x_i, y and n are prepared elsewhere
        dt = tree.DecisionTreeClassifier(
            criterion='gini', splitter='best', min_samples_split=2,
            min_samples_leaf=n * self.params.min_category,
            max_depth=self.params.max_depth, max_features=None,
            random_state=None, min_density=None, compute_importances=None)
        dt.fit(x_i, y)
        start_time = time.time()  # only the prediction step is timed
        dt.predict_proba(x_i)[:, 1]  # takes ages here
        print time.time() - start_time, "seconds"
        # Django - 301.682538033 seconds, CLI - 0.06 seconds
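
To narrow this down, the same call could be instrumented in both the shell and the view, and the two results compared. The following is only a minimal sketch, not a diagnosis: the helper names describe_environment and profile_call are made up for illustration, and it uses nothing beyond the standard library. It is written for Python 3; on the Python 2 setup shown above, io.StringIO would have to be swapped for StringIO.StringIO.

```python
import cProfile
import io
import os
import pstats
import sys
import threading
import time


def describe_environment(x):
    """Collect facts that may differ between the shell and the WSGI process.

    Hypothetical helper: `x` stands for the feature matrix passed to the tree.
    """
    return {
        "python": sys.version.split()[0],
        "pid": os.getpid(),
        "thread": threading.current_thread().name,
        "active_threads": threading.active_count(),
        "input_type": type(x).__name__,
        # dtype and shape exist on NumPy arrays; plain lists report None
        "input_dtype": str(getattr(x, "dtype", None)),
        "input_shape": getattr(x, "shape", None),
    }


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, elapsed seconds, report text)."""
    profiler = cProfile.Profile()
    start = time.time()
    result = profiler.runcall(func, *args, **kwargs)
    elapsed = time.time() - start
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the largest cumulative time
    stats.sort_stats("cumulative").print_stats(10)
    return result, elapsed, stream.getvalue()
```

In myfit, the timed line would become something like profile_call(dt.predict_proba, x_i), with describe_environment(x_i) logged next to it. If the input type, dtype, or shape differs between the two runs, or the profile report shows time spent somewhere other than in scikit-learn, that would point at what the view does differently.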