Tensorflow: Blas GEMM launch failed: a.shape=(2, 128), b.shape=(128, 44), m=2, n=44, k=128

Question

I am trying to run a classifier on a generated set of embeddings. The error is as follows:

Epoch 1/2
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
<ipython-input-11-d859ddc5fde5> in <module>
      2 y = np.zeros((2, ))
      3 
----> 4 model.fit(x, y, epochs=2)

~/.virtualenvs/pan_demo/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, max_queue_size, workers, use_multiprocessing, **kwargs)
    878           initial_epoch=initial_epoch,
    879           steps_per_epoch=steps_per_epoch,
--> 880           validation_steps=validation_steps)
    881 
    882   def evaluate(self,

~/.virtualenvs/pan_demo/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py in model_iteration(model, inputs, targets, sample_weights, batch_size, epochs, verbose, callbacks, val_inputs, val_targets, val_sample_weights, shuffle, initial_epoch, steps_per_epoch, validation_steps, mode, validation_in_fit, **kwargs)
    327 
    328         # Get outputs.
--> 329         batch_outs = f(ins_batch)
    330         if not isinstance(batch_outs, list):
    331           batch_outs = [batch_outs]

~/.virtualenvs/pan_demo/lib/python3.6/site-packages/tensorflow/python/keras/backend.py in __call__(self, inputs)
   3074 
   3075     fetched = self._callable_fn(*array_vals,
-> 3076                                 run_metadata=self.run_metadata)
   3077     self._call_fetch_callbacks(fetched[-len(self._fetches):])
   3078     return nest.pack_sequence_as(self._outputs_structure,

~/.virtualenvs/pan_demo/lib/python3.6/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
   1437           ret = tf_session.TF_SessionRunCallable(
   1438               self._session._session, self._handle, args, status,
-> 1439               run_metadata_ptr)
   1440         if run_metadata:
   1441           proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/.virtualenvs/pan_demo/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    526             None, None,
    527             compat.as_text(c_api.TF_Message(self.status.status)),
--> 528             c_api.TF_GetCode(self.status.status))
    529     # Delete the underlying status object from memory otherwise it stays alive
    530     # as there is a reference to status from this from the traceback due to

InternalError: Blas GEMM launch failed : a.shape=(2, 128), b.shape=(128, 44), m=2, n=44, k=128
     [[{{node layer_1/MatMul}}]]
     [[{{node loss_1/output_loss/broadcast_weights/assert_broadcastable/is_valid_shape/has_valid_nonscalar_shape/has_invalid_dims/concat}}]]

Model looks like this:

import numpy as np
import tensorflow as tf

# num_of_employees (the number of output classes) is defined elsewhere.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(44, activation=tf.nn.relu, name='layer_1', input_shape=(128, )),
    # tf.keras.layers.Dropout(0.3, name='layer_2'),
    tf.keras.layers.Dense(num_of_employees, activation=tf.nn.softmax, name='output')
])

opt = tf.keras.optimizers.SGD(lr=0.34)
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.build()

# Dummy batch: 2 samples of 128-dimensional embeddings, all labelled class 0.
x = np.zeros((2, 128))
y = np.zeros((2, ))

model.fit(x, y, epochs=2)

Also, it would be really helpful if someone could explain how to read the last part of the stack trace, where it mentions node loss_1/output_loss...

score 0 · Accepted Answer · answered Oct 17 '19 at 09:05

Short answer:

import tensorflow as tf

# Let TensorFlow grow its GPU memory allocation on demand instead of
# reserving all of it up front (TF 1.x API).
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
print('############## Allowing Growth ###########')
session = tf.Session(config=config)

# -------------------  start importing keras module ---------------------
from keras.backend.tensorflow_backend import set_session
# Register the session created above as the one Keras uses.
set_session(session)

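As mentioned here.

The Jupyter Notebook log showed "failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED", which is why the fix above was required. That status typically means cuBLAS could not get the GPU memory it needed, which can happen when TensorFlow has already reserved all of it.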
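For TensorFlow 2.x, where tf.ConfigProto and tf.Session are no longer top-level names, a rough equivalent of the allow_growth setting above might look like the sketch below. It assumes the tf.config.experimental API and has to run before any GPU work is done. The helper name enable_gpu_memory_growth is hypothetical and not part of the original answer.

```python
import tensorflow as tf


def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand (TF 2.x sketch).

    Returns the names of the GPUs for which memory growth was enabled.
    On a machine with no visible GPU the list is simply empty.
    """
    configured = []
    for gpu in tf.config.experimental.list_physical_devices('GPU'):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured.append(gpu.name)
        except RuntimeError as err:
            # Raised if the GPU was already initialized before this call.
            print('Could not enable memory growth for', gpu.name, ':', err)
    return configured


# Call this once, at the top of the notebook, before building any model.
configured = enable_gpu_memory_growth()
print('Memory growth enabled on:', configured)
```

If the list printed is empty, TensorFlow does not see a GPU at all, and the cause is more likely a CUDA/driver mismatch than memory allocation.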

Sounds like you got it working, but in hopes of saving people some time... I was using CUDA 10.2 when I saw the "Blas GEMM launch failed" error. I ended up downgrading to CUDA 10.0 and it worked (on TensorFlow 2.0 and cuDNN 7.6.5). - iatechicken, Nov 30 '19 at 10:12
