Learning a perceptron can be accomplished using the update rule $w_i = w_i + \eta\,(y - \hat{y})\,x_i$.
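For concreteness, here is a minimal sketch of what I mean (NumPy, labels in $\{0, 1\}$, bias column assumed appended to the data; the function name and defaults are my own):

```python
import numpy as np

def perceptron_train(X, y, eta=1.0, max_epochs=100):
    """Perceptron training with the rule w <- w + eta * (y - y_hat) * x.

    Assumes labels y in {0, 1} and a bias column already appended to X.
    """
    w = np.zeros(X.shape[1])  # initial weight vector (see question below)
    for _ in range(max_epochs):
        errors = 0
        for x_i, y_i in zip(X, y):
            y_hat = 1 if np.dot(w, x_i) >= 0 else 0  # threshold activation
            if y_hat != y_i:
                w += eta * (y_i - y_hat) * x_i  # the update rule above
                errors += 1
        if errors == 0:  # converged: every point classified correctly
            break
    return w
```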
All the resources I have read so far say that the learning rate $\eta$ can be set to 1 without loss of generality.
My question is the following: is there any proof that the speed of convergence will always be the same regardless of $\eta$, given that the data is linearly separable? Shouldn't this also depend on the initial weight vector $w$?
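To make the question concrete, here is the kind of toy experiment I have in mind (the dataset and helper below are purely illustrative, building on the sketch above):

```python
import numpy as np

# Toy linearly separable dataset with a bias column appended.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X = np.hstack([X, np.ones((100, 1))])

def count_updates(eta, w0, max_epochs=100):
    """Count how many weight updates occur before convergence."""
    w = w0.copy()
    updates = 0
    for _ in range(max_epochs):
        changed = False
        for x_i, y_i in zip(X, y):
            y_hat = 1 if np.dot(w, x_i) >= 0 else 0
            if y_hat != y_i:
                w += eta * (y_i - y_hat) * x_i
                updates += 1
                changed = True
        if not changed:
            break
    return updates

# Vary eta with a zero initial vector: w simply scales with eta,
# and sign(w . x) is scale-invariant, so the counts coincide.
for eta in (0.1, 1.0, 10.0):
    print(eta, count_updates(eta, np.zeros(3)))

# A nonzero initial vector can give a different update count.
print(count_updates(1.0, np.array([5.0, -3.0, 1.0])))
```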