
Update

I finally found the mistake: the bug is in my full code. I am a beginner in Python, so... Thanks to @mrdomoboto for pointing out one error, and thanks to @Spacedman for asking me to create a reproducible example, which made me go back and look at my full code.

Sorry for the noise. I will check my code more carefully before asking a question. Should I delete it?

#

I am doing some experiments with the perceptron algorithm and trying to learn Python along the way, so I implemented it in both R and Python. However, I found that my Python code is around 10 times slower than my R code, even though I translated the R code into Python almost line by line. I would really like to know the reason. Please point out the problems in my code.

R code:

perceptron <- function(X,y,ini=c(0,0,0)){
  w <- ini
  N <- length(y)
  continue <- T
  while(continue){
    cont <- 0
    for(i in 1:N){
      if(sign(sum(X[i,]*w)*y[i])==1){cont <- cont+1}
      else{w <- w + y[i]*X[i,]}
    }
    if(cont==N){continue <- F}
  }
  return(w)
}

My Python code:

def Perceptron(X,y,ini=(0,0,0)):
    w = np.array(ini)
    N = X.shape[0]
    # add ones as the first columns of X
    X = np.hstack((np.ones(N).reshape(N,1), X))
    go_next = True
    while go_next:
        cont = 0
        for i in range(N):
            if np.sign(X[i,:].dot(w)*y[i]) == 1:
                cont = cont + 1
            else: w = w + y[i]*X[i,:]
            if cont==N: go_next=False
    return w
ANuo
  • I would recommend you profile both programs ([profiling in R](https://www.r-bloggers.com/profiling-r-code/), [profiling in Python](http://stackoverflow.com/questions/582336/how-can-you-profile-a-script)). This will show which parts of the code take the longest. As the two programs are quite similar, it might reveal which parts take longer in Python. – Paul Hiemstra Feb 10 '17 at 14:27
  • If you remove the line `go_next = True`, use a `while True` loop, and change `if cont==N: go_next=False` to `if cont==N: break`, it should be a couple of milliseconds quicker. – WhatsThePoint Feb 10 '17 at 14:28
  • Are you sure `if cont==N: go_next=False` should be within the scope of the `for` loop in the python code? It's not in the R program. – ospahiu Feb 10 '17 at 14:29
  • @WhatsThePoint And same in R: use `repeat` instead of the `while` loop. In general both the R code and the Python code can be improved quite a bit. – Konrad Rudolph Feb 10 '17 at 14:30
  • @KonradRudolph ive never looked at R so i cant confirm – WhatsThePoint Feb 10 '17 at 14:31
  • Which version of Python? The reason I ask is that `range()` should be `xrange()` in Python 2. Also, since `N` does not change, the `range` (or `xrange`) object could be created outside the `while` loop and reused. – cdarke Feb 10 '17 at 14:34
  • You should make reproducible examples that we can run - so supply some input data for each example (or code to generate some). – Spacedman Feb 10 '17 at 14:40
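
The profiling suggestion in the comments above can be sketched roughly as follows, using the standard-library `cProfile` and `pstats` modules. The function `slow_dot_loop` and the random input data are made up for illustration and are not the asker's code; this only shows one way to see where time goes in a row-by-row NumPy loop.

```python
import cProfile
import io
import pstats

import numpy as np


def slow_dot_loop(X, w):
    """Hypothetical stand-in for a row-by-row loop like the one in the question."""
    total = 0.0
    for i in range(X.shape[0]):
        total += X[i, :].dot(w)  # one small NumPy call per row
    return total


# Made-up input data, only so that the example can run.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
w = np.array([0.5, -1.0, 2.0])

# Profile a single call to the function.
profiler = cProfile.Profile()
profiler.enable()
result = slow_dot_loop(X, w)
profiler.disable()

# Collect the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and time per function, which is usually enough to tell whether the cost sits in the loop body itself or in the many small NumPy calls it makes.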

1 Answer


Any computation placed inside the innermost loop of an algorithm runs once per iteration of that loop, so it can cost far more than the same computation placed in the enclosing scope, where it runs only once per outer pass.

`if cont==N: go_next=False` should be moved outside of the innermost `for` loop (as it is in your R program).

Take a look at computational complexity analysis.
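
A minimal sketch of the question's Python function with the convergence check moved out of the inner loop, mirroring the R version. The `max_epochs` argument and the toy data are assumptions added here so the example terminates and runs on its own; they are not part of the original code. The test `> 0` is equivalent to the original `np.sign(...) == 1`.

```python
import numpy as np


def perceptron(X, y, ini=(0.0, 0.0, 0.0), max_epochs=1000):
    """Perceptron sketch; max_epochs is a hypothetical safeguard, not in the original."""
    w = np.array(ini, dtype=float)
    N = X.shape[0]
    # Add a column of ones as the first column of X, as in the question's code.
    X = np.hstack((np.ones((N, 1)), X))
    for _ in range(max_epochs):
        correct = 0
        for i in range(N):
            if X[i].dot(w) * y[i] > 0:
                correct += 1
            else:
                w = w + y[i] * X[i]
        # The check now runs once per pass over the data, not once per sample.
        if correct == N:
            break
    return w


# Made-up, linearly separable toy data for illustration only.
X_demo = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y_demo = np.array([1, 1, -1, -1])
w_demo = perceptron(X_demo, y_demo)
```

Moving the check saves only one integer comparison per sample, so on its own it is unlikely to account for a tenfold difference in run time.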

ospahiu
  • Hi @mrdomoboto, many thanks! But this is not the reason the Python code is 10 times slower than the R code. There are some other errors in my full code. – ANuo Feb 10 '17 at 14:57