
I'm currently trying to use the scikit-learn package for its neural network functionality. I have a complex problem to solve with it, but to start out I am just running a couple of basic tests to familiarize myself with it. I have gotten it to do something, but it isn't producing meaningful results. My code:

from sklearn.neural_network import MLPRegressor
import numpy

def generateTargetDataset(expression="x", generateRange=(-100, 100), s=1000):
    # Parenthesize the substituted value so negative inputs are handled
    # correctly (in Python, eval("-5.0**2") gives -25.0).
    expression = expression.replace("x", "(%s)")
    x = numpy.random.rand(s,)
    y = numpy.zeros((s,), dtype="float")
    # Scale and shift the uniform samples into [min, max).
    numpy.multiply(x, abs(generateRange[1] - generateRange[0]), x)
    numpy.add(x, min(generateRange), x)
    for z in range(0, numpy.size(x)):
        y[z] = eval(expression % (x[z]))
    x = x.reshape(-1, 1)
    return (x, y)
print("New Net + Training")
QuadRegressor = MLPRegressor(hidden_layer_sizes=(10,), warm_start=True, verbose=True, learning_rate_init=0.00001, max_iter=10000, solver="sgd", tol=0.000001)
data = generateTargetDataset(expression="x**2", s=10000, generateRange=(-1,1))
QuadRegressor.fit(data[0], data[1])
print("Net Trained")
# Predict on fresh random inputs in [0, 1), then scale both columns
# up for inspection before writing them to CSV.
xt = numpy.random.rand(10000, 1)
yr = QuadRegressor.predict(xt)
yr = yr.reshape(-1, 1)
numpy.multiply(xt, 100, xt)
numpy.multiply(yr, 10000, yr)
numpy.around(yr, 2, out=yr)
numpy.around(xt, 2, out=xt)
out = numpy.concatenate((xt, yr), axis=1)
numpy.set_printoptions(precision=4)
numpy.savetxt(fname="C:\\SCRATCHDIR\\numpydump.csv", X=out, delimiter=",")

I'm not sure how best to post the data it gives me, but it spits out between 7000 and 10000 for all inputs between 0 and 100. It seems to be mapped fairly correctly very close to the top of the range, but for inputs close to 0 it just returns something near 7000.

EDIT: I forgot to add this. The network behaves the same if I remove the dummy pre-training on y = x, but I read somewhere that you can sometimes help a network along by first training it on a different but similar function, then using that already-weighted network as a starting point. It didn't work, but I just hadn't taken that bit out yet.
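For reference, a minimal sketch of that warm-start idea using sklearn's public API (the hyperparameter values here are illustrative, not tuned):

from sklearn.neural_network import MLPRegressor
import numpy as np

# Pre-train on the simpler y = x, then continue on y = x**2;
# warm_start=True makes the second fit() start from the learned weights.
x = np.random.uniform(-1, 1, size=(10000, 1))
net = MLPRegressor(hidden_layer_sizes=(10,), warm_start=True,
                   solver="sgd", learning_rate_init=0.001, max_iter=10000)
net.fit(x, x.ravel())
net.fit(x, x.ravel() ** 2)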

  • Have you normalised your input data and how are you initialising your weights? – tttthomasssss Aug 03 '16 at 07:57
  • There is only a single input to this network, ranging from -100 to 100. I guess I could try normalizing it to -1 to 1 and then multiplying by 10,000 at the end. I have tried initializing the weights randomly, using the built in functionality in scikit-learn, as well as using a warm start from a network trained to regress y=x. – Henry Prickett-Morgan Aug 03 '16 at 16:22
  • I just tried normalizing my input to -1,1 and lowered my learning rate tenfold. I also set the tolerance for convergence much lower because the error values are much smaller now. The network trains about 60 times faster now, but it hasn't actually increased its performance, because all the predicted datapoints roughly follow the equation f(x) = 240x - 17000, instead of x^2 like they should – Henry Prickett-Morgan Aug 03 '16 at 16:57
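A minimal sketch of the input normalization discussed in the comments above, assuming sklearn's MinMaxScaler (the scaler choice and values are assumptions):

from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Scale inputs from [-100, 100] into [-1, 1] before fitting, and undo
# the target scaling after predicting.
x = np.random.uniform(-100, 100, size=(10000, 1))
y = (x ** 2).ravel()

scaler = MinMaxScaler(feature_range=(-1, 1))
x_scaled = scaler.fit_transform(x)   # inputs now span [-1, 1]
y_scale = y.max()
y_scaled = y / y_scale               # targets now span [0, 1]
# ...fit a regressor on (x_scaled, y_scaled), then multiply its
# predictions by y_scale to recover the original units.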

1 Answer


My recommendation is to reduce the number of neurons per layer, and increase the training dataset size. Right now, you have a lot of parameters to train in your network, and a small training set (~10K). However, the main point of my answer is that sklearn probably isn't a great choice for your end application.
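For illustration, here's a minimal sketch of that first recommendation in sklearn (the unit count and sample size are assumptions, not tuned values):

from sklearn.neural_network import MLPRegressor
import numpy as np

# A smaller hidden layer and roughly 10x more training samples.
x = np.random.uniform(-1, 1, size=(100000, 1))
y = (x ** 2).ravel()

small_net = MLPRegressor(hidden_layer_sizes=(5,), solver="sgd",
                         learning_rate_init=0.001, max_iter=5000)
small_net.fit(x, y)
print(small_net.score(x, y))   # R^2 on the training data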

So you have a complex problem you want to solve with neural networks?

"I have a complex problem to solve with it, but to start out I am just trying a couple of basic tests to familiarize myself with it."

According to the official user guide, sklearn's implementation of neural networks isn't designed for large applications and is a lot less flexible than other options for deep learning.

One Python deep learning library I've had good experiences with is Keras, a modular, easy-to-use library with GPU support.

Here's a sample I coded up that trains a single perceptron to do quadratic regression.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Input
from tensorflow.keras.optimizers import SGD
import numpy as np
import matplotlib.pyplot as plt

# A single dense unit with a sigmoid activation.
model = Sequential()
model.add(Input(shape=(1,)))
model.add(Dense(1, kernel_initializer='random_uniform'))
model.add(Activation('sigmoid'))

model.compile(optimizer=SGD(learning_rate=0.02, momentum=0.9, nesterov=True),
              loss='mse')

# Training data: x uniform in [0, 1), targets y = x**2.
data = np.random.random(1000)
labels = data ** 2

model.fit(data.reshape((len(data), 1)), labels, epochs=1000,
          batch_size=128, verbose=1)

# Fresh, sorted test points so the plot reads left to right.
tdata = np.sort(np.random.random(100))
tlabels = tdata ** 2

preds = model.predict(tdata.reshape((len(tdata), 1)))

plt.plot(tdata, tlabels)       # true curve
plt.scatter(tdata, preds)      # network predictions
plt.show()

This outputs a scatter plot of the test data points, along with a plot of the true curve.

[Image: Keras quadratic regression fit]

As you can see, the results are reasonable. In general, neural networks are hard to train, and I had to do some parameter tuning before I got this example working.
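Continuing from the variables in the sample above, a quick numeric sanity check (a sketch; mean_squared_error is sklearn's standard regression metric):

from sklearn.metrics import mean_squared_error

# Reuses tlabels and preds from the example above.
print("test MSE: %.5f" % mean_squared_error(tlabels, preds.ravel()))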

It looks like you're using Windows. This question may be helpful for installing Keras on Windows.
