
I am training a neural network with Keras in Python to predict whether two given phrases are paraphrases of each other, but I am getting some very strange behaviour.

When I run something like this (semi-pseudocode):

for _ in range(10):
    predictions = train_and_predict_nn(features, test_size)
    print(predictions)

I don't change any of the input parameters between iterations; I only train and evaluate the neural network several times. When I look at the predictions for the test set, most runs predict only one class. Occasionally a run predicts both classes (as it should). I am very confused by this, since nothing in the inputs changes and yet the results fluctuate this much. The dataset has roughly 10k instances and is roughly balanced, with about 60% of instances labelled 1 and the rest labelled 0. Can anybody explain this strange behaviour?
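
To make "only predicts one class" concrete, here is a minimal sketch of how each run could be tallied. The helper name `summarize_run` and the toy data are hypothetical and not part of my pipeline; the sketch assumes the predictions and labels are plain 0/1 values. It also computes the majority-class baseline: with about 60% of labels equal to 1, a run that always predicts 1 still reaches about 60% accuracy, so accuracy alone would not reveal the collapse.

```python
import numpy as np

def summarize_run(predictions, labels):
    """Tally predicted classes for one run and compare with the majority-class baseline."""
    predictions = np.asarray(predictions, dtype=int)
    labels = np.asarray(labels, dtype=int)
    # counts[0] = number of predicted 0s, counts[1] = number of predicted 1s
    counts = np.bincount(predictions, minlength=2)
    accuracy = float(np.mean(predictions == labels))
    # Accuracy obtained by always predicting the most frequent label
    baseline = float(np.bincount(labels, minlength=2).max() / len(labels))
    return {
        "counts": counts.tolist(),
        "collapsed": bool(np.count_nonzero(counts) == 1),
        "accuracy": accuracy,
        "baseline": baseline,
    }

# Toy data (made up for illustration): 60% of the labels are 1
labels = [1, 1, 1, 0, 0] * 2
collapsed_run = [1] * 10          # a run that predicts only class 1
mixed_run = [1, 1, 0, 0, 0] * 2   # a run that predicts both classes

print(summarize_run(collapsed_run, labels))
print(summarize_run(mixed_run, labels))
```

In the loop above, calling something like this on every iteration would show how often the model collapses to a single class and whether its accuracy ever rises above the baseline.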

Edit: I'll try to include some more information. This is the method I use to train the neural network and to classify the test set.

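import math

from keras.models import Sequential
from keras.layers import Dense

def classify(feature_selection, test_size, feature_file_name):

    # parse() and get_features() are helpers that are not shown in this post
    features, labels = parse(feature_file_name)
    X_train, X_test, y_train, y_test = get_features(
        features, labels, feature_selection, test_size)
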
    num_features = len(feature_selection)
    num_epochs = math.floor(len(X_train)/20)

    model = Sequential()
    model.add(Dense(num_features, input_dim=num_features, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(3, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    model.fit(X_train, y_train, epochs=num_epochs, batch_size=20, verbose=0)

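    # evaluate() returns [loss, accuracy] on the test set
    score = model.evaluate(X_test, y_test)
    # note: these predictions are for the training set (X_train), not X_test
    predictions = model.predict(X_train)
    # round each sigmoid output to a 0/1 class label
    rounded = [round(x[0]) for x in predictions]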
    return model, score, rounded
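
Since the last layer is a sigmoid, one check that might help is to look at the raw outputs before rounding them. The following is only a sketch under assumptions: `threshold_predictions` is a hypothetical helper, and the example probabilities are made up rather than taken from my model. If all outputs sit in a narrow band on one side of 0.5, every rounded prediction will fall into the same class.

```python
import numpy as np

def threshold_predictions(probabilities, threshold=0.5):
    """Turn sigmoid outputs of shape (n, 1) into 0/1 labels and report their spread."""
    probs = np.asarray(probabilities, dtype=float).reshape(-1)
    # Explicit threshold instead of round(), so the cut-off is visible
    classes = (probs >= threshold).astype(int)
    # Difference between the largest and smallest output
    spread = float(probs.max() - probs.min())
    return classes, spread

# Made-up outputs that all hover around the share of positive labels
probs = np.array([[0.61], [0.60], [0.62], [0.59]])
classes, spread = threshold_predictions(probs)
print(classes.tolist(), round(spread, 2))
```

A very small spread would suggest that the network outputs nearly the same value for every input, which matches the single-class predictions described above.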

The input features are numeric. Printing X_train gives me something like this:

[[0.2041 1. 0.0909 0.1667 0.]
....
[0.1     1. 0.     0.6972 0.]]

For numpy.shape(X_train) I get (9800, 5).

imc
  • We would have to see a great deal more about your network to tell you anything useful. – user3483203 Apr 20 '18 at 18:44
  • The problem is that I'm working with a pretty big architecture and can't really put all that code here. I was hoping someone might know this to be a general problem. – imc Apr 20 '18 at 18:47
  • There could be a dozen possible problems; we can't narrow it down without knowing details about your NN and your data. – Mohammad Athar Apr 20 '18 at 18:49
