I installed Anaconda3 on my Ubuntu AWS instance. I finished creating the environment:

conda create -n tensorflow-gpu

Then I used the high-level command:

conda install tensorflow-gpu

This works on my own computer but not on AWS. It installed all the relevant packages and didn't yield any errors. But when I import tensorflow, I get this error:

Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 72, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/ubuntu/.conda/envs/tensorflow-gpu/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

This is beyond my ability to understand, since it worked well on my own computer. I also tried the suggestions in:

tensorflow import error in anaconda

They didn't work either.
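The last line of the traceback says the dynamic linker cannot open `libcuda.so.1`. As far as I know, that library normally ships with the NVIDIA driver rather than with the CUDA toolkit or the conda package, so it can be checked without importing TensorFlow. A minimal sketch, assuming a Linux machine; the helper name `can_load` is hypothetical and only for illustration:

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic linker can open the shared library lib_name."""
    try:
        # CDLL asks the dynamic linker to open the library, using the same
        # search paths (ld.so cache, LD_LIBRARY_PATH) as a normal import would.
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False


if __name__ == "__main__":
    name = "libcuda.so.1"  # the library named in the ImportError above
    status = "can be loaded" if can_load(name) else "cannot be loaded"
    print(name, status)
```

If this reports that the library cannot be loaded, the import of `tensorflow` in the same environment would be expected to fail in the same way.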

aL_eX
ZHANG Juenjie
  • You probably have not installed CUDA on AWS. Which AWS instance are you using? Try following the complete steps listed in the [official documentation](https://www.tensorflow.org/install/install_linux). One discrepancy: the official documentation asks you to install `CUDA toolkit 8.0`, but you should install `CUDA toolkit 9.0` and cuDNN version 7.0 instead, then follow the rest of the steps. – layog Jan 28 '18 at 08:16
  • @layog It is a p2.8xlarge. But when I connected to it, there was nothing on it! I have to install everything, but I fail at CUDA. When I run `lspci | grep -i nvidia` to see the GPU information, it prints nothing. Quite weird! – ZHANG Juenjie Jan 28 '18 at 09:14
  • You should try installing everything from scratch! CUDA, cuDNN and tensorflow – layog Jan 28 '18 at 09:30
  • I downloaded the CUDA file according to the documentation, but I encountered an error: `Driver: Installation Failed`, `Toolkit: Installation skipped`, `Samples: Installation skipped`. "The driver installation is unable to locate the kernel source. Please make sure that the kernel source packages are installed and set up correctly. If you know that the kernel source packages are installed and set up correctly, you may pass the location of the kernel source with the '--kernel-source-path' flag." – ZHANG Juenjie Jan 28 '18 at 09:31
  • Skip the driver installation and install toolkit only – layog Jan 28 '18 at 09:33
  • @layog I downloaded the file cuda_9.1.85_387.26_linux.run. It is the toolkit, not the driver. What is going on? – ZHANG Juenjie Jan 28 '18 at 09:40
  • First of all, you'll need CUDA 9.0; the file you downloaded is 9.1, and it includes the driver as well. When you run this setup, after you accept the agreement, it asks whether to install the driver. Decline the driver installation at that point. – layog Jan 28 '18 at 09:42
  • @layog I followed all the instructions and installed the toolkit. But it still doesn't work! My god! – ZHANG Juenjie Jan 28 '18 at 11:10
  • @layog Now I have no idea what to do. Should I redo the whole procedure? – ZHANG Juenjie Jan 28 '18 at 11:10
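The `lspci | grep -i nvidia` check mentioned in the comments can also be run from Python, which may be convenient when scripting the instance setup. A minimal sketch, assuming `lspci` is on the `PATH`; the function name `nvidia_pci_devices` is hypothetical. An empty result means either that `lspci` is unavailable or that no NVIDIA device is visible on the PCI bus:

```python
import shutil
import subprocess


def nvidia_pci_devices():
    """Return the lspci output lines that mention NVIDIA (case-insensitive)."""
    if shutil.which("lspci") is None:
        # lspci is not installed, so there is nothing to inspect.
        return []
    result = subprocess.run(
        ["lspci"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
    # Same filter as `grep -i nvidia`.
    return [line for line in result.stdout.splitlines() if "nvidia" in line.lower()]


if __name__ == "__main__":
    devices = nvidia_pci_devices()
    if devices:
        for line in devices:
            print(line)
    else:
        print("No NVIDIA PCI device found (or lspci is unavailable)")
```

If no device is listed, installing the toolkit alone is unlikely to help, since the instance does not appear to expose a GPU at all.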

0 Answers