I ran Vowpal Wabbit on a full dataset and got the final coefficients. I also ran the same data through a batch learning method (glm in R) to get coefficients. The coefficients I got from Vowpal Wabbit are hugely different from the batch learning coefficients.
I was under the impression that Vowpal Wabbit uses a gradient descent algorithm for any given loss function (squared loss, logistic loss), so I expected the end results to match to some extent. But one set of coefficients is on the order of 10^-1 (online) and the other is on the order of 10^4 (batch). Could someone please explain the difference? I even used multiple passes (the same number of iterations the batch learning used).
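For reference, the VW run looked roughly like this (file names and the pass count are placeholders, not my exact command):

    # online run with Vowpal Wabbit; labels in train.vw are -1/+1 as logistic loss expects
    vw -d train.vw --loss_function logistic --passes 10 \
       --cache_file train.cache --holdout_off \
       -f model.vw --invert_hash model.readable
    # --cache_file is needed for multiple passes; --holdout_off makes every pass use all the data
    # model.readable contains the per-feature weights I compared against glm's coefficients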
========================
Some info: in glm I used the binomial family, and in Vowpal Wabbit I used loss_function logistic.
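The R side was essentially a plain logistic regression fit like this (the data frame and column names are placeholders for my actual features):

    # batch logistic regression on the same data
    train <- read.csv("train.csv")                      # same rows/features as the VW file
    fit <- glm(label ~ ., family = binomial, data = train)
    coef(fit)                                           # coefficients I compared against VW's weights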