
I'm attempting to calculate the decision_function of an SVC classifier manually (as opposed to using the built-in method) with the Python library scikit-learn.

I've tried several methods; however, I can only ever get the manual calculation to match when I don't scale my data.

z is a test datum (that has been scaled), and I think the other variables speak for themselves (also, I'm using an RBF kernel, if that's not obvious from the code).
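For reference, the variables come from a fit along these lines (a minimal sketch with made-up data; C = 1000 and gamma = 0.01 are the values I'm actually using):

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Made-up data, just to keep the sketch self-contained
rng = np.random.RandomState(0)
X_train = rng.randn(100, 2)
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

scaler = StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)

clf = SVC(kernel='rbf', C=1000, gamma=0.01)
clf.fit(X_scaled, y_train)

sup_vecs = clf.support_vectors_    # shape (n_SV, n_features)
dual_coefs = clf.dual_coef_        # y_i * alpha_i per support vector, shape (1, n_SV)
intercept = clf.intercept_[0]
gamma = 0.01                       # the same value passed to SVC

z = scaler.transform(rng.randn(1, 2))[0]   # a scaled test datum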

Here are the methods that I've tried:

1. Looping method:

dec_func = 0
for j in range(np.shape(sup_vecs)[0]):
    norm2 = np.linalg.norm(sup_vecs[j, :] - z)**2   # squared Euclidean distance to the j-th support vector
    dec_func = dec_func + dual_coefs[0, j] * np.exp(-gamma*norm2)

dec_func += intercept

2. Vectorized method:

diff = sup_vecs - z
norm2 = np.sum(np.sqrt(diff*diff), 1)**2
dec_func = dual_coefs.dot(np.exp(-gamma*norm2)) + intercept

However, neither of these ever returns the same value as decision_function. I think it may have something to do with rescaling my values, or, more likely, it's something silly that I've been overlooking!

Any help would be appreciated.

  • Have you tried with `kernel=precomputed` and passing in the kernel you computed yourself? (See the sketch after these comments.) – Andreas Mueller Feb 14 '15 at 22:48
  • @AndreasMueller, I've used the 'off the shelf' RBF kernel contained within the SVC class with C = 1000 and gamma = 0.01. After training, I call `clf.decision_function(z)` and get a value; this value, however, never matches the value I produce when I perform the calculation by hand as demonstrated above... I'm wondering if there's something wrong with my maths, or is there a bug in libsvm? – precicely Feb 15 '15 at 00:48
  • Yeah I am wondering that too, and I would love to get confirmation on that. What I was proposing was to use a precomputed kernel, as that would make sure that the computation of the kernel inside libsvm is the same as the one that you do. Someone reported a similar issue earlier and I am afraid of having introduced a sign error somewhere. – Andreas Mueller Feb 17 '15 at 23:11
  • `np.sqrt(diff*diff)` is this a typo or a fancy way to calculate `abs` of `diff`? – Artem Sobolev Feb 18 '15 at 10:51
  • @AndreasMueller, as always, it was a stupid mistake on my part; please see my answer below. However, I am a bit perplexed by the sign change that's required to get the answers to match. Any idea why this is the case? @Barmaley.exe `np.sqrt(diff*diff)` was just an attempt to make sure I wasn't misunderstanding any of Python's/numpy's mathematical implementations (I've come from a Matlab background and hence I'm still getting my head around Python's and numpy's nuances). See my answer below for my final implementations. – precicely Feb 18 '15 at 18:03
  • @user1182556, I'm not sure I'm following: why do sup_vecs and z have the same size? – Riley Aug 17 '18 at 06:47
  • 1
    @AndreasMueller, I'm afraid I don't follow: in the calculation of diff, why do z and sup_vecs always have the same size? They could be two very different things, no? – Riley Aug 17 '18 at 06:50
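For anyone who wants to try the precomputed-kernel suggestion from the comments above, a rough sketch (reusing the made-up names from the question's setup; rbf_kernel is from sklearn.metrics.pairwise) might look like:

from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Precompute the Gram matrix so libsvm uses exactly the kernel values we computed
K_train = rbf_kernel(X_scaled, X_scaled, gamma=0.01)
clf_pre = SVC(kernel='precomputed', C=1000)
clf_pre.fit(K_train, y_train)

# Kernel values between the test datum and all training points
K_test = rbf_kernel(z.reshape(1, -1), X_scaled, gamma=0.01)
print(clf_pre.decision_function(K_test))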

1 Answer


So after a bit more digging and head scratching, I've figured it out.

As I mentioned above, z is a test datum that's been scaled. To scale it, I had to extract the .mean_ and .std_ attributes from the preprocessing.StandardScaler() object (after calling .fit() on my training data, of course).

I was then using this scaled z as an input both to my manual calculations and to the inbuilt function. However, the inbuilt function was part of a pipeline which already had StandardScaler as its first step, and as a result z was getting scaled twice! Hence, when I removed scaling from my pipeline, the manual answers "matched" the inbuilt function's answer.
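To illustrate the pitfall (a self-contained sketch with made-up data; the variable names are mine):

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)
X_train = rng.randn(100, 2)
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
z_raw = rng.randn(1, 2)   # an unscaled test datum

pipe = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1000, gamma=0.01))
pipe.fit(X_train, y_train)

z_scaled = StandardScaler().fit(X_train).transform(z_raw)

# The pipeline scales whatever it is given, so z_scaled gets scaled a second time:
print(pipe.decision_function(z_scaled))   # wrong: double-scaled input
print(pipe.decision_function(z_raw))      # right: the pipeline expects raw data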

I say "matched" in quotes, by the way, as I found I always had to flip the sign of my manual calculations to match the inbuilt version. Currently I have no idea why this is the case.

To conclude, I misunderstood how pipelines worked.

For those who are interested, here are the final versions of my manual methods:

import numpy as np

# Here dual_coefs is the 1-D version, i.e. clf.dual_coef_[0], hence the
# dual_coefs[j] indexing rather than the dual_coefs[0, j] used in the question.
diff = sup_vecs - z_scaled

# Looping method
dec_func_loop = 0
for j in range(np.shape(sup_vecs)[0]):
    norm2 = np.linalg.norm(diff[j, :])   # distance to the j-th support vector
    dec_func_loop = dec_func_loop + dual_coefs[j] * np.exp(-gamma*(norm2**2))

dec_func_loop = -1 * (dec_func_loop - intercept)

# Vectorized method
norm2 = np.array([np.linalg.norm(diff[n, :]) for n in range(np.shape(sup_vecs)[0])])
dec_func_vec = -1 * (dual_coefs.dot(np.exp(-gamma*(norm2**2))) - intercept)
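And a quick sanity check against the inbuilt version (clf being the SVC fitted directly on scaled data, outside any pipeline; see the comments below about the sign convention changing in later scikit-learn versions):

builtin = clf.decision_function(z_scaled.reshape(1, -1))
print(dec_func_vec, builtin)   # these should agree to floating-point precision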

Addendum

For those who are interested in implementing a manual method for a multiclass SVC, the following link is helpful: https://stackoverflow.com/a/27752709/1182556

  • The sign is flipped because `sklearn` supports all kinds of targets (strings, for example) and internally maps them in ascending order. That means the smaller value (in case of binary classification) like `y=0` or `y=-1` will be mapped to the `+1` class, and the bigger one like `y=1` — to the `-1` class. This, essentially, flips the sign of the dual coefficients. This is also the reason the intercept has a minus in the documentation. – Artem Sobolev Feb 25 '15 at 22:22
  • 1
    Please be aware that this was considered [a bug](https://github.com/scikit-learn/scikit-learn/pull/4326), so your code will break on sklearn 0.16+. The fix introduced a sanity check test, so you can find a code to reproduce `decision_function`. Basically, the signs of `dual_coef_` and `intercept_` were inverted. – Artem Sobolev Mar 12 '15 at 18:43
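Given that fix, on scikit-learn 0.16+ the straightforward formula should match decision_function directly, with no sign flip (a hedged sketch using the same variable names as the answer):

import numpy as np

norm2 = np.sum((sup_vecs - z_scaled)**2, axis=1)              # squared distances
dec_func = dual_coefs.dot(np.exp(-gamma * norm2)) + intercept
# dec_func should now equal clf.decision_function(z_scaled.reshape(1, -1))[0]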