
I'm not sure how useful this really is for a regression task, but it would be quite nice to see how well my algorithm has learnt the training set.

I found plotting a 2D problem quite simple, and yet I'm having trouble plotting in 3D.
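
For reference, this is roughly what I did in the 2D case (with np.polyfit standing in for my gradient-descent fit):

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 200, 20)
y = 0.4 * x + np.random.uniform(-100, 100, x.size)   # line plus uniform noise
w = np.polyfit(x, y, 1)        # stand-in for the gradient-descent weights
plt.scatter(x, y)              # the training data
plt.plot(x, w[0] * x + w[1])   # the learnt line
plt.show()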

import numpy as np
import itertools
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D


def gradient_descent(x, y, w, lr, m, n_iter):
    xTrans = x.transpose()
    for i in range(n_iter):
        prediction = np.dot(x, w)
        loss = prediction - y
        cost = np.sum(loss ** 2) / m    # mean squared error

        print("Iteration %d | Cost: %f" % (i + 1, cost))

        gradient = np.dot(xTrans, loss) / m     # average gradient over the batch

        w -= lr * gradient   # update the weight vector

    return w

it = np.ones(shape=(100, 3))    # it[:,2] stays at 1 and acts as the bias column
x = np.arange(1, 200, 20)       # 10 sample values per axis
d = np.random.uniform(-100, 100, 100)   # uniform noise, one value per data point

m, n = np.shape(it)

# initialise weights to 0
w = np.zeros(n)

# fill the first two columns of it with every pairwise combination of x (100 rows)
indx = 0
for a, b in itertools.product(x, x):
    it[indx][0] = a
    it[indx][1] = b
    indx += 1

# target: a linear function of the two inputs plus the uniform noise
y = .4*it[:,0] + 1.4*it[:,1] + d

n_iter = 1500           # number of iterations
lr = 0.00001            # learning rate / alpha

trained_w = gradient_descent(it, y, w, lr, m, n_iter)
result = trained_w[0] * it[:,0] + trained_w[1] * it[:,1] + trained_w[2]  # model prediction at every training point
print("Final weights: %f | %f | %f" % (trained_w[0], trained_w[1], trained_w[2]))

# scatter of data set + trained function hyperplane
plt3d = plt.figure().add_subplot(projection='3d')
plt3d.scatter(it[:,0], it[:,1], y)
x_surf, y_surf = np.meshgrid(it[:,0], it[:,1])
plt3d.plot_surface(x_surf, y_surf, result)

plt.show()

The resulting plot looks a little odd:

[screenshot of the resulting 3D plot]


1 Answer


The problem is that you're mixing up dimensions in your plot.

You start by constructing a mesh from (x,x). If I understand correctly, you could've done the same with meshgrid:

x = np.arange(1, 200, 20)
N = x.size
a, b = np.meshgrid(x, x)
it = np.array([b.ravel(), a.ravel(), np.ones(N**2)]).T   # shape (N**2, 3)
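
A quick sanity check (a standalone sketch; it_loop is rebuilt with the itertools.product loop from the question) confirms the two constructions produce the same array:

import itertools
import numpy as np

x = np.arange(1, 200, 20)
N = x.size
a, b = np.meshgrid(x, x)
it_mesh = np.array([b.ravel(), a.ravel(), np.ones(N**2)]).T

it_loop = np.ones((N**2, 3))
for idx, (p, q) in enumerate(itertools.product(x, x)):
    it_loop[idx, 0] = p
    it_loop[idx, 1] = q

print(np.array_equal(it_mesh, it_loop))   # True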

Then you train on your noisy input data and obtain result, an array of length N^2 that holds a single value for each coordinate pair in it (which has shape (N^2, 3)). Finally, you generate a new mesh from it and try to plot a surface using those coordinates and result.

So you're passing two coordinate arrays, x_surf and y_surf of shape (N^2, N^2), and a surface value result of shape (N^2,) to plot_surface. The only reason this works at all is that matplotlib presumably broadcasts the arrays when interpreting the input, replicating the smaller set of values across the larger set of coordinates.
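
You can see the mismatch by printing the shapes (using the variables from the question, where N**2 = 100):

x_surf, y_surf = np.meshgrid(it[:,0], it[:,1])
print(x_surf.shape)   # (100, 100): a mesh built from all N**2 points
print(result.shape)   # (100,): only one surface value per data point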

So you have two choices. Either plot your surface using the input grid, or, if for some reason you need a different mesh (say, a larger or smaller domain for plotting the surface), use bilinear interpolation/extrapolation to recompute your surface on it. OK, this is a very fancy way of saying

result = trained_w[0] * x_surf + trained_w[1] * y_surf + trained_w[2]
# instead of
# result = trained_w[0] * it[:,0] + trained_w[1] * it[:,1] + trained_w[2]
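
Equivalently, since the model is linear, you could build the plotting mesh directly from x instead of from the columns of it (a minimal sketch, reusing x, trained_w and plt3d from above):

x_surf, y_surf = np.meshgrid(x, x)     # (N, N) grids, no redundant points
result = trained_w[0] * x_surf + trained_w[1] * y_surf + trained_w[2]
plt3d.plot_surface(x_surf, y_surf, result)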

If you stick with your original mesh, then you have to np.reshape the coordinates and values back to shape (N, N) to make plot_surface happy; with the approach above, result already has the proper shape for plotting. Using the former approach:

plt3d = plt.figure().add_subplot(projection='3d')
plt3d.scatter(it[:,0], it[:,1], y)
plt3d.plot_surface(np.reshape(it[:,0], (N, N)), np.reshape(it[:,1], (N, N)), np.reshape(result, (N, N)))
# where N = x.size still

Result:

[plot: the data scatter with the fitted plane]

Note that I would call this object simply a plane: "3d hyperplane" might be over-selling it a bit.