I'm trying to build a TensorFlow neural network with a sigmoid-activated hidden layer and a softmax output layer with 3 classes. The outputs are mostly very bad, and I believe the mistake is in my model construction, because I've built a similar model in Matlab and its results were good. The data is normalized. The outputs look like this:
[9.2164397e-01 1.6932052e-03 7.6662831e-02]
[3.4100169e-01 2.2419590e-01 4.3480241e-01]
[2.3466848e-06 1.3276369e-04 9.9986482e-01]
[6.5199631e-01 3.4800139e-01 2.3596617e-06]
[9.9879754e-01 9.0103465e-05 1.1123115e-03]
[6.5749985e-01 2.8860433e-02 3.1363973e-01]
My network looks like this:
def multilayer_perceptron(x, weights, biases, keep_prob):
    # Hidden layer: affine transform, sigmoid activation, then dropout
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.sigmoid(layer_1)
    layer_1 = tf.nn.dropout(layer_1, keep_prob)
    # Output layer: affine transform followed by softmax
    out_layer = tf.nn.softmax(tf.add(tf.matmul(layer_1, weights['out']), biases['out']))
    return out_layer
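For completeness, the weights and biases are ordinary variables. I haven't reproduced my exact setup here, but it's roughly the following sketch; the sizes n_input and n_hidden_1 are placeholders for my actual dimensions:

# Rough sketch of the variable setup (layer sizes are illustrative)
n_input = 10      # number of input features (placeholder value)
n_hidden_1 = 32   # hidden layer width (placeholder value)
n_classes = 3     # three output classes

weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_hidden_1, n_classes])),
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_classes])),
}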
With the following cost function:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=predictions, labels=y))
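The rest of the training setup is the standard pattern. A rough sketch of what mine looks like; the optimizer choice, learning_rate, training_epochs, and the batch variables are placeholders for my actual values:

# Sketch of the training loop (hyperparameters and batch source are illustrative)
learning_rate = 0.001
training_epochs = 100

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(training_epochs):
        # batch_x, batch_y come from my input pipeline (not shown here)
        _, c = sess.run([optimizer, cost],
                        feed_dict={x: batch_x, y: batch_y, keep_prob: 0.8})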
I'm growing convinced that my implementation is incorrect and that I'm doing something very silly. Hours on Google and looking at other examples haven't helped.
UPDATE: When I changed the cost function to the one shown below, I got decent results. This feels wrong, though.
cost = tf.losses.mean_squared_error(predictions=predictions, labels=y)
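For what it's worth, "decent" here is judged with the usual accuracy check, roughly as sketched below; this assumes one-hot labels in y, and test_x / test_y stand in for my held-out data:

# Accuracy check used to compare the two cost functions (assumes one-hot y)
correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Evaluated with dropout disabled:
# sess.run(accuracy, feed_dict={x: test_x, y: test_y, keep_prob: 1.0})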