I have a Python class that uses scikit-learn (sklearn) to build a decision tree. It runs in under 1 s when I execute it in the Django shell (`python manage.py shell`): I load the data from a CSV file, instantiate my class, and call the `fit` function of `DecisionTreeClassifier`.

However, when I run the same code inside a Django view, it takes 6 minutes and 6 GB of RAM to finish.

Could threads or multiprocessing be involved?

Update: I don't think the code is the problem; I suspect the Django environment or WSGI is the cause. However, here is some code:

import time
from sklearn import tree

def myfit(self, x_i, y, n):  # signature assumed; the original omitted the parameter list
    dt = tree.DecisionTreeClassifier(
        criterion='gini', splitter='best', min_samples_split=2,
        min_samples_leaf=n * self.params.min_category,
        max_depth=self.params.max_depth, max_features=None,
        random_state=None, min_density=None, compute_importances=None)
    dt.fit(x_i, y)
    start_time = time.time()
    dt.predict_proba(x_i)[:, 1]  # takes ages here
    print("%s seconds" % (time.time() - start_time))
    # Django - 301.682538033 seconds, CLI - 0.06 seconds
Dzidas
  • it is difficult to answer your question without more precise info on your code – joaquin Sep 24 '13 at 17:40
  • Have you tried [profiling your application](http://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script)? – Mark Hildreth Sep 24 '13 at 18:47
  • It's entirely possible your template is somehow horribly broken. At the very least add some `print time.time()` statements to see if it's calling `myfit()` that slows down or all the Django crud around it. – millimoose Sep 24 '13 at 18:48
  • I know where it gets stuck (please check the code excerpt), but I struggle to understand why it happens only in Django. – Dzidas Sep 24 '13 at 18:55
  • I have added time profiling – Dzidas Sep 24 '13 at 20:21
  • @Dzidas Is there anything going on in a database during that time? – Izkata Sep 24 '13 at 20:22
  • If you have `DEBUG = True`, try setting it to `False` and time it again – Izkata Sep 24 '13 at 20:23
  • Are you sure you are passing the same data in the CLI and the Django API? – ogrisel Sep 24 '13 at 20:24
  • @ogrisel Very good question. I wrote a mock in which the same random data is passed to the predict function, and the results come back instantly in Django. That suggests I messed up the data input on the Django side. – Dzidas Sep 25 '13 at 12:42
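
Following up on the last two comments: one way to check whether the view and the shell really receive the same data is to fingerprint the input right before `fit` in both places and compare the results. This is only a minimal sketch, assuming the data can be iterated row by row; `describe_rows`, `cli_data` and `view_data` are hypothetical names that do not appear in the original code.

```python
import hashlib
from collections import Counter


def describe_rows(rows):
    """Summarise a 2-D dataset so that two code paths can be compared."""
    n_rows = 0
    widths = Counter()      # row length -> number of rows with that length
    cell_types = Counter()  # type name  -> number of cells of that type
    digest = hashlib.sha1()
    for row in rows:
        n_rows += 1
        cells = list(row)
        widths[len(cells)] += 1
        for cell in cells:
            cell_types[type(cell).__name__] += 1
            # repr() keeps 1.0 and "1.0" distinct in the hash
            digest.update(repr(cell).encode("utf-8"))
            digest.update(b"|")
    return {
        "rows": n_rows,
        "widths": dict(widths),
        "cell_types": dict(cell_types),
        "sha1": digest.hexdigest(),
    }


# Hypothetical inputs: same values, but the second set arrives as strings.
cli_data = [[1.0, 2.0], [3.0, 4.0]]
view_data = [["1.0", "2.0"], ["3.0", "4.0"]]

cli_summary = describe_rows(cli_data)
view_summary = describe_rows(view_data)
print(cli_summary)
print(view_summary)
print("identical input:", cli_summary == view_summary)
```

If the two summaries differ in row count, row width, or cell types, the slowdown more likely comes from how the view assembles its input than from Django or WSGI itself.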

0 Answers