1

I am trying to understand how LSTM RNNs work and how they can be implemented in Keras to solve a binary classification problem. My code and the dataset I use are shown below. When I run the code I get the error `TypeError: __init__() got multiple values for keyword argument 'input_dim'`. Can anybody help?

from keras.models import Sequential
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.layers import Dense
from sklearn.cross_validation import train_test_split
import numpy
from sklearn.preprocessing import StandardScaler # data normalization

seed = 7
numpy.random.seed(seed)
dataset = numpy.loadtxt("sorted output.csv", delimiter=",")
X = dataset[:,0:4]
scaler = StandardScaler(copy=True, with_mean=True, with_std=True ) #data normalization
X = scaler.fit_transform(X) #data normalization
Y = dataset[:,4]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
model = Sequential()
model.add(Embedding(12,input_dim=4,init='uniform',activation='relu'))
model.add(Dense(4, init='uniform', activation='relu'))
model.add(LSTM(100))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=150, batch_size=10)

[Image: the dataset table]

davdis
  • This has nothing to do with neural networks, Theano or Keras. The only problem seems to be that `numpy.loadtxt("sorted output.csv", delimiter=",")` cannot find the file 'sorted output.csv'. Are you sure it exists in the directory from which you launch your application? Also try an absolute path, and if that does not help, try removing the spaces in the filename. I will only believe your "but the dataset i try to import exists" if you have the Python code confirm that this file exists before the numpy function is called... – example Oct 06 '16 at 10:08
  • I am sure that my dataset exists in the working directory, since a different NN (not a recurrent one) works just fine on the same dataset. So the only option is that my RNN is not correctly implemented. – davdis Oct 06 '16 at 10:13
  • Your NN never sees the filename, so it will never produce the provided error. Typically you get a stack trace when an error occurs in Python. Use it to figure out where the error happened and tell us which line it happened at. – example Oct 06 '16 at 10:16
  • The error I get is: `TypeError: __init__() got multiple values for keyword argument 'input_dim'` – davdis Oct 06 '16 at 11:00
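
The two checks suggested in the comments above (confirm the file exists before loading it, and read the stack trace to find the failing line) could look roughly like the sketch below. `load_checked` and `describe_failure` are hypothetical helper names invented for illustration, and the filename is a placeholder assumed not to exist; neither helper is part of numpy or Keras.

```python
import os
import traceback


def load_checked(path):
    """Fail with a clear message before any parsing if the file is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError("no such file: " + os.path.abspath(path))
    with open(path) as handle:
        return handle.read()


def describe_failure(func, *args):
    """Run func(*args); return None on success, else the formatted stack trace."""
    try:
        func(*args)
        return None
    except Exception:
        # format_exc() returns the same text Python prints for an uncaught error,
        # including the file, line number and function where it was raised.
        return traceback.format_exc()


# Placeholder filename, assumed to be absent from the working directory.
report = describe_failure(load_checked, "hypothetical missing file.csv")
print(report)
```

The last lines of the printed trace name the function and line that raised, which is the information asked for in the comments.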

2 Answers

0

Looks like two separate questions here.

Regarding how to use LSTMs / Keras, there are some good tutorials around. Try this one which also describes a binary classification problem. If you have a specific issue or area that you don't understand, let me know.

Regarding the file opening issue, perhaps the whitespace in the filename is causing an issue. Check out this answer to see if it helps.

Community
Rehan Ali
  • I am aware of the tutorial you provided, and I have in fact created a lot of feedforward NNs successfully. However, the problem occurs when I try to construct a recurrent NN (for the same dataset that I used for the feedforward ones), specifically an LSTM NN. – davdis Oct 06 '16 at 10:17
0

This is in fact a case where the error message you are getting is perfectly to-the-point. (I wish this would always be the case with Python and Keras...)

Keras' Embedding layer constructor has this signature: `keras.layers.embeddings.Embedding(input_dim, output_dim, ...)`

However, you are constructing it using: `Embedding(12,input_dim=4,...)`

So figure out which is the input and output dimension, respectively, and fix your parameter order and names. Based on the table you included in the question, I'm guessing 4 is your input dimension and 12 is your output dimension; then it'd be `Embedding(input_dim=4, output_dim=12, ...)`.
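
To see why Python complains, here is a minimal sketch that reproduces the same kind of `TypeError` without Keras. `FakeEmbedding` and `try_build` are hypothetical stand-ins that only copy the `(input_dim, output_dim)` parameter order quoted above; they are not the real layer.

```python
class FakeEmbedding:
    """Hypothetical stand-in that mimics only the (input_dim, output_dim) order."""

    def __init__(self, input_dim, output_dim, **kwargs):
        self.input_dim = input_dim
        self.output_dim = output_dim


def try_build(*args, **kwargs):
    """Return the constructed object, or the TypeError message if the call fails."""
    try:
        return FakeEmbedding(*args, **kwargs)
    except TypeError as exc:
        return str(exc)


# The positional 12 is bound to input_dim first; input_dim=4 then tries to
# bind the same parameter a second time, which Python rejects.
broken = try_build(12, input_dim=4)

# Naming both arguments removes the ambiguity.
fixed = try_build(input_dim=4, output_dim=12)

print(broken)
print(fixed.input_dim, fixed.output_dim)
```

The exact wording of the message differs slightly between Python versions, but it always names `input_dim` as the parameter that received two values.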

Petr Baudis