6

I am trying to use a PlayStation Eye camera for a deep reinforcement learning project. The network, the TensorFlow installation (0.11), and CUDA (8.0) are working, because I have been able to train the network in a simulation.

Now, when I try to read in images from the real camera, the network code crashes with the error below. Is there a mistake in my OpenCV installation (3.2.0), or is the problem somewhere else? I would be eternally grateful for any help, because I have not been able to find any information about this problem.

E tensorflow/stream_executor/cuda/cuda_blas.cc:367] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
W tensorflow/stream_executor/stream.cc:1390] attempting to perform BLAS operation using StreamExecutor without BLAS support


Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "main.py", line 48, in worker
    action = dqn.getAction()
  File "../network/evaluation.py", line 141, in getAction
    Q_value = self.Q_value.eval(feed_dict= {self.input_state:[self.currentState]})[0]
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 559, in eval
    return _eval_using_default_session(self, feed_dict, self.graph, session)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3761, in _eval_using_default_session
    return session.run(tensors, feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 717, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 915, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 965, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 985, in _do_call
    raise type(e)(node_def, op, message)
InternalError: Blas SGEMM launch failed : a.shape=(1, 1600), b.shape=(1600, 4), m=1, n=4, k=1600
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape, Variable_6/read)]]

Relevant code from the camera class:

# OpenCV
import numpy as np
import cv2

# scipy
from scipy.misc import imresize

# Time
from time import time

# Clean exit
import sys
import os

# Max value for the gray values
MAX_GRAY = 255.0
INPUT_SIZE = 75

class Camera:


    # Initialization method
    def __init__(self, duration, exchanger, framesPerAction = 10, width = 640, height = 480, show = True):

        # Create the video capture
        self.cap = cv2.VideoCapture(1)

        # Set the parameters of the capture
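        # (VideoCapture property IDs: 3 = CAP_PROP_FRAME_WIDTH, 4 = CAP_PROP_FRAME_HEIGHT, 5 = CAP_PROP_FPS)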
        self.cap.set(3, width)
        self.cap.set(4, height)
        self.cap.set(5, 30)

        # Get the properties of the capture
        self.width = int(self.cap.get(3))
        self.height = int(self.cap.get(4))
        self.fps = int(self.cap.get(5))

        # Print these properties
        print 'Width:', self.width, '| Height:', self.height, '| FPS:', self.fps

        # Duration that the camera should be running
        self.duration = duration

        # Number of frames that should be between every extracted frame
        self.framesPerAction = framesPerAction

        # Exchanges the frames with the network
        self.exchanger = exchanger

        # Display the frames on the monitor
        self.show = show

        # Counter for the number of frames since the last action
        self.frameCounter = 0

    # Starts the loop for the camera
    def run(self):

        startTime = time()

        # Loop for a certain time
        while(self.duration > time() - startTime):

            # Check frames per second
#           print 'Start of this frame', time()-startTime

            # Capture frame-by-frame
            ret, frame = self.cap.read()

            # Close when user types ESCAPE(27)
            if cv2.waitKey(1) & 0xFF == 27:
                break

            # Increment framecounter
            if(self.frameCounter != self.framesPerAction):
                self.frameCounter += 1

            # Extract the resulting frame
            else:

                # Crop to square
                step = int((640 - 480) / 2)
                result = frame[0 : 480, step : step + 480]

                # Downsample the image
#               result = cv2.resize(gray, (75, 75))
                result = imresize(result, size=(INPUT_SIZE, INPUT_SIZE, 3))

                # Transform to grayscale
#               gray = cv2.cvtColor(input, cv2.COLOR_BGR2GRAY)
                result = self.rgb2gray(result)

                # Change range of image from [0,255] --> [0, 1]
                result = result / MAX_GRAY

                # Store the frame on the exchanger
                self.exchanger.store(0, False, result)

                # reset framecounter
                self.frameCounter = 0

            # Display the frame on the monitor
            if(self.show):
                cv2.imshow('frame', frame)

        # When everything done, release the capture
        self.cap.release()
        cv2.destroyAllWindows()

        # Exit so that the network thread also stops running
        os._exit(0)
  • Possible duplicate of [Tensorflow crashes with CUBLAS_STATUS_ALLOC_FAILED](http://stackoverflow.com/questions/41117740/tensorflow-crashes-with-cublas-status-alloc-failed) – talonmies Feb 27 '17 at 16:38
  • Do you still get the issue on TensorFlow 1.0? – Neal Feb 28 '17 at 01:26
  • @talonmies That fix does not work here. I think the problem is that the network has trouble using cuBLAS because OpenCV is already using it. – RandomEngineer Feb 28 '17 at 09:36
  • @Neal Thanks, this actually fixed the issue. I was not planning on updating because of the backwards-compatibility problems, but it was actually easier than expected. Thanks for the tip. – RandomEngineer Feb 28 '17 at 10:05
  • Awesome! For future reference, if you need to upgrade code from an older version of TensorFlow to 1.0, you can use this upgrade script: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/compatibility – Neal Feb 28 '17 at 19:14
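For reference, the upgrade script mentioned in the last comment is normally invoked like this (a rough sketch; the flags are the ones documented in the tool's README, and the output names here are only placeholders):

# Upgrade a single file from the 0.x API to the 1.0 API
python tf_upgrade.py --infile main.py --outfile main_upgraded.py

# Or upgrade a whole source tree in one go
python tf_upgrade.py --intree network/ --outtree network_upgraded/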

3 Answers

7

Maybe the following command helps:

sudo rm -rf .nv/

Good luck.
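The .nv/ directory is the NVIDIA compute cache, which normally sits in the user's home directory, so the command presumably has to be run from there (or with the full path), roughly:

rm -rf ~/.nv/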

Haozhe Xie
5

I had this problem, but with a simpler solution. My issue was that I was running the script from the command line while an IDLE shell of that same script was open at the same time. Closing the shell solved the issue.
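More generally, if you suspect that a second interpreter or notebook is still holding the GPU, nvidia-smi lists the processes that currently occupy GPU memory; the snippet below is only a generic check, not part of the original answer:

# Show GPU utilisation and the processes holding GPU memory
nvidia-smi

# Kill a stale Python process from the table above (<PID> is a placeholder)
kill <PID>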

2

Apparently this error can have a variety of causes. I solved this issue by following this issue on the official repo. The PyPI build of TensorFlow GPU 2.2 uses CUDA 10.1 and libcublas 10.2.1.243, but I had cuBLAS 10.2.2.89 installed. To solve it:

Centos:

yum remove libcublas
yum install libcublas10-10.2.1.243-1.x86_64

Ubuntu:

sudo apt remove libcublas10
sudo apt install libcublas10=10.2.1.243-1

Then I removed the nvidia cache:

rm -rf ~/.nv/

And it worked.

Long story short, NVIDIA's closed-source policy has created a labyrinth of version mismatches. You either have to build your TensorFlow distribution against your own versions of CUDA, cuDNN and cuBLAS, which is not as easy as it sounds, or make sure you have exactly the right version of each of them installed, which, again because of NVIDIA's minimal cooperation with the Linux Foundation and open-source projects, is not as easy as it could be.
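If you run into a similar mismatch, it helps to first check which libcublas version is actually installed before pinning a package; the commands below are a generic sketch, and the exact package names depend on your distro and CUDA release:

# Show the libcublas shared libraries known to the dynamic linker
ldconfig -p | grep libcublas

# Ubuntu: show the installed and candidate versions of the package
apt-cache policy libcublas10

# CentOS: show the installed version of the package
yum list installed | grep libcublas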

Iman Akbari