
I used this Keras word-level text generation example as a base for my own work.

When I tried to train it on my data, I got a MemoryError at line 59:

X = np.zeros((len(sentences), maxlen, len(words)), dtype=np.bool)

as a big matrix is initialized with zeros. The same zero initialization will happen to y:

y = np.zeros((len(sentences), len(words)), dtype=np.bool)
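To see why this runs out of memory, here is a rough back-of-envelope sketch (the corpus sizes are made-up assumptions for illustration, not numbers from my data):

```python
# Illustrative (assumed) corpus sizes -- dtype=bool uses one byte per entry
n_sentences, maxlen, vocab = 200_000, 30, 50_000

x_bytes = n_sentences * maxlen * vocab  # X allocation: samples x maxlen x vocab
y_bytes = n_sentences * vocab           # y allocation: samples x vocab

print((x_bytes + y_bytes) / 1e9, "GB")  # hundreds of GB of mostly zeros
```

Almost all of those entries stay zero (each row is one-hot), which is why a sparse representation looks attractive.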

Later on, the matrices are filled in like this (one-hot encoding):

for i, sentence in enumerate(sentences):
    for t, word in enumerate(sentence.split()):
        X[i, t, word_indices[word]] = 1
    y[i, word_indices[next_words[i]]] = 1


How can I use sparse matrices to save memory and make it executable?

I looked at SciPy's sparse matrices, but it seems they only support 2D matrices.
I also looked at TensorFlow's sparse tensors, but I don't think they support zero initialization with later filling (which is what I need).
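One workaround for the 2D limitation that I have considered is flattening the (samples, maxlen, vocab) tensor into a (samples, maxlen * vocab) sparse matrix. This is only a sketch with a toy vocabulary, not tested on my real corpus:

```python
import numpy as np
from scipy.sparse import lil_matrix

# Toy data standing in for the real corpus (assumption for illustration)
sentences = ["the cat sat", "the dog ran"]
next_words = ["down", "away"]
words = sorted({w for s in sentences for w in s.split()} | set(next_words))
word_indices = {w: i for i, w in enumerate(words)}
maxlen = 3
V = len(words)

# lil_matrix supports efficient incremental assignment; only the ones are stored
X = lil_matrix((len(sentences), maxlen * V), dtype=np.int8)
y = lil_matrix((len(sentences), V), dtype=np.int8)

for i, sentence in enumerate(sentences):
    for t, word in enumerate(sentence.split()):
        X[i, t * V + word_indices[word]] = 1  # flattened (t, word) index
    y[i, word_indices[next_words[i]]] = 1

# Convert to CSR for fast row slicing; a batch can be densified and
# reshaped back to (batch, maxlen, V) just before feeding the model
X_csr = X.tocsr()
```

The open question for me is whether this plays well with Keras, since the model would still need dense (or per-batch densified) input.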

AIpeter
  • If you are using a for loop then it's not vectorized. Try scikit-learn's `HashingVectorizer` to get the sparse matrix. – Bharath M Shetty Oct 24 '17 at 11:50
  • Use a reshaped 2D sparse matrix instead? – Divakar Oct 24 '17 at 12:10
  • @Bharathshetty How can I use the scikit `HashingVectorizer` to create a matrix filled with zeros? I don't see this capability in the [docs](http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.HashingVectorizer.html#sklearn.feature_extraction.text.HashingVectorizer.transform). – AIpeter Oct 24 '17 at 12:55
  • I'm late to the party, but it's similar to this question: https://stackoverflow.com/questions/41538692/using-sparse-matrices-with-keras-and-tensorflow – Edward Gaere Jun 08 '21 at 16:34

0 Answers