
For some input X, e.g.:

[[ 1.456044 -7.058824]
 [-4.478022 -2.072829]
 [-7.664835 -6.890756]
 [-5.137363  2.352941]
 ...

And Y, e.g.:

[ 1.  1.  1.  -1.  ...

Here is my perceptron training function:

def train(self, X, Y, iterations=1000):
    # Prepend a column of ones so the bias is learned as w[0].
    biases = np.ones(X.shape[0])
    X = np.vstack((biases, X.T)).T
    w = np.random.randn(X.shape[1])
    errors = []

    for _ in range(iterations):
        all_corr = True
        num_err = 0
        for x, y in zip(X, Y):
            # A sample is correct when the sign of w . x matches its label y.
            correct = np.dot(w, x) * y > 0
            if not correct:
                num_err += 1
                all_corr = False
                # Perceptron update: move w toward the misclassified sample.
                w += y * x
        errors.append(num_err)
        # Exit early if all samples are correctly classified.
        if all_corr:
            break

    self.w = perpendicular(w[1:])
    self.b = w[0]
    return self.w, self.b, errors

When I print the errors, I typically see something like:

[28, 12, 10, 7, 10, 8, 11, 8, 0]

Notice that I am getting 0 errors, but the plotted separator is clearly off by some bias:

[Plot: Samples vs. Accuracy]

For example, here is b for one run:

-28.6778508366

I have looked at this SO answer but do not see a difference between our algorithms. I think the issue may be how I am interpreting and then plotting w and b. I just do something very simple:

def plot(X, Y, w, b):
    area = 20
    fig = plt.figure()
    ax = fig.add_subplot(111)
    p = X[Y == 1]
    n = X[Y == -1]
    ax.scatter(p[:, 0], p[:, 1], s=area, c='r', marker="o", label='pos')
    ax.scatter(n[:, 0], n[:, 1], s=area, c='b', marker="s", label='neg')
    neg_w = -w
    xs = [neg_w[0], w[0]]
    ys = [neg_w[1], w[1]]  # My guess is that this is where the bias goes?
    ax.plot(xs, ys, 'r--', label='hyperplane')
    ...

1 Answer

Yes, I think you learned the correct w but did not plot the separating line correctly.

Your dataset is 2-dimensional, so your w has 2 dimensions. Let's say w = [w1, w2].

The separating line is w1 * x1 + w2 * x2 + b = 0. I think you are using two points on that line to draw it. Those two points can be found as follows:

  • First, set x1 to 0; we get x2 = -b/w2.
  • Second, set x2 to 0; we get x1 = -b/w1.

Thus the two points are (0, -b/w2) and (-b/w1, 0). In your formulas for xs and ys, I do not see how b is used. Could you try setting:

# Note w[0] = w1, w[1] = w2. 
xs = [0, -b/w[0]]   # x-coordinate of the two points on line.
ys = [-b/w[1], 0]   # y-coordinate.
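
For reference, here is a minimal sketch of how the full plotting function could look with the bias included (my own illustration; plot_with_bias is a hypothetical name, and it assumes you pass the raw w = w[1:] and b = w[0] from training, without taking the perpendicular):

import numpy as np
import matplotlib.pyplot as plt

def plot_with_bias(X, Y, w, b):
    # Scatter the two classes.
    fig, ax = plt.subplots()
    p = X[Y == 1]
    n = X[Y == -1]
    ax.scatter(p[:, 0], p[:, 1], s=20, c='r', marker='o', label='pos')
    ax.scatter(n[:, 0], n[:, 1], s=20, c='b', marker='s', label='neg')
    # Separating line w[0]*x1 + w[1]*x2 + b = 0, drawn through its two
    # axis intercepts (0, -b/w[1]) and (-b/w[0], 0).
    xs = [0, -b / w[0]]
    ys = [-b / w[1], 0]
    ax.plot(xs, ys, 'r--', label='hyperplane')
    ax.legend()
    plt.show()

Drawing through the two intercepts only covers the segment between them; if your data extends further, you could instead evaluate x2 = -(w[0]*x1 + b) / w[1] over the full x1 range of the data.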

See the graph below, taken from the slides mentioned by @gwg. The red solid line is the separator you learned through w (not self.w). The red dotted arrow indicates that on that side of the separator, sum(w*x) > 0. It is also useful in margin-based models (the perceptron is such a model) to calculate the margin of your learned model: if you start anywhere on the separator and move along its perpendicular, the distance at which you reach the first example defines the "margin" on that side.

snapshot from http://www.cs.princeton.edu/courses/archive/fall16/cos402/lectures/402-lec4.pdf
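
If you want to compute that margin, a minimal sketch could look like this (my own illustration; it assumes X without the bias column, labels Y in {-1, 1}, and the learned w and b):

import numpy as np

def margin(X, Y, w, b):
    # Signed distance of each sample to the separator w.x + b = 0;
    # positive when the sample lies on the correct side.
    distances = Y * (X.dot(w) + b) / np.linalg.norm(w)
    # The margin is the distance of the closest (correctly classified) sample.
    return distances.min()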

  • This works, but only if I do not take the perpendicular of `w`, which is not what I expected. I thought that the learned `w` was orthogonal to the actual hyperplane; that's why taking the dot product between `w` and a point `x` works to classify it. I was trying to adapt this algorithm (slide 25): http://www.cs.princeton.edu/courses/archive/fall16/cos402/lectures/402-lec4.pdf. But I think I am performing it incorrectly. – jds Nov 02 '16 at 14:45
  • Yes, I did not notice you took the perpendicular (`self.w = perpendicular(w[1:])`). I am not aware of why you would want to do that. The weight vector itself describes the linear separator; we do not want to use the perpendicular of the line instead. You might have gotten confused by something in the slides (the solid red line in the slides is the separator, not the red dotted line with the arrow). The red dotted line indicates that on that side of the separator, sum(w*x) > 0. – greeness Nov 02 '16 at 21:12
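
To illustrate the point made in these comments: w is the normal of the separating line, so any direction vector along the line is already perpendicular to w, and there is no need to rotate w again before classifying. A small sketch with hypothetical values:

import numpy as np

w = np.array([2.0, 3.0])          # example learned weights (hypothetical)
b = -28.68                        # example learned bias (hypothetical)
p1 = np.array([0.0, -b / w[1]])   # intercept with the x2 axis
p2 = np.array([-b / w[0], 0.0])   # intercept with the x1 axis
direction = p2 - p1               # vector pointing along the separating line
print(np.dot(w, direction))       # ~0: w is orthogonal to the line itself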