Segmentation fault (core dumped) in Ubuntu 18.04 using an RTX 2080 Ti

Question

I recently acquired an RTX 2080 Ti to run some deep learning projects locally. I have tried to install tensorflow-gpu on Ubuntu 18.04 several times, and the only guide that appears to work is the following: https://www.pugetsystems.com/labs/hpc/Install-TensorFlow-with-GPU-Support-the-Easy-Way-on-Ubuntu-18-04-without-installing-CUDA-1170/#look-at-the-job-run-with-tensorboard

However, when I run a script, the following error shows up:

Using TensorFlow backend.
Train on 60000 samples, validate on 10000 samples
2019-01-09 14:49:06.748318: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-01-09 14:49:07.730143: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-01-09 14:49:07.732970: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:01:00.0
totalMemory: 10.73GiB freeMemory: 10.23GiB
2019-01-09 14:49:07.733071: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-01-09 14:49:30.666591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-09 14:49:30.666636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-01-09 14:49:30.666646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-01-09 14:49:30.667094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9875 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
Epoch 1/15
Segmentation fault (core dumped)

Could anyone give me some feedback on how to make TensorFlow work properly with my GPU?

Thank you.

Possible duplicate of [Illegal instruction(core dumped) tensorflow](https://stackoverflow.com/questions/49092527/illegal-instructioncore-dumped-tensorflow) — Amir, Jan 09 '19 at 16:43
Thanks Amir. Can you tell me how to fix this problem? I'm also going to attach the output of nvidia-smi — Marcos Soares, Jan 10 '19 at 18:38
On another note, I installed tensorflow-gpu on Windows on the same computer and it works. My suspicion is therefore that the problem lies in how Ubuntu communicates with the graphics card, but I might be completely wrong — Marcos Soares, Jan 10 '19 at 18:44
I've downgraded to TF 1.5.0 and it is still not working. Do you have any other suggestions? — Marcos Soares, Jan 11 '19 at 00:23
I faced the same issue, and downgrading the TensorFlow version fixed the problem. Some people report that version 1.3 fixed their issue; you can check that out as well. — Amir, Jan 11 '19 at 07:20

score 0 · Answer 1 · answered May 13 '19 at 18:25

You can try the following.

My setup: RTX 2080, Ubuntu 16.04

You need to install:

CUDA 10.0
cuDNN v7.4.1.5
libcudnn7-dev_7.4.1.5-1+cuda10.0_amd64
libcudnn7-doc_7.4.1.5-1+cuda10.0_amd64
libcudnn7_7.4.1.5-1+cuda10.0_amd64
nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64

nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    Off  | 00000000:02:00.0 Off |                  N/A |
| 22%   39C    P0    N/A /  N/A |      0MiB /  7951MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

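For some reason nvidia-smi shows 10.1, but that is not the installed toolkit version: nvidia-smi reports the highest CUDA version the driver supports, while nvcc reports the toolkit that is actually installed.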

nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
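
The two tools report different things: nvidia-smi prints the highest CUDA version the driver supports, while nvcc --version prints the installed toolkit release. A minimal sketch for pulling both numbers out of their text output follows. The function names are hypothetical, and the regular expressions assume the output formats shown in this answer.

```python
import re


def driver_cuda_version(smi_text):
    """Return the 'CUDA Version' field from nvidia-smi's header, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_text)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_text):
    """Return the toolkit release from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_text)
    return match.group(1) if match else None


# Sample lines copied from the outputs in this answer.
smi_header = "| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |"
nvcc_line = "Cuda compilation tools, release 10.0, V10.0.130"

print(driver_cuda_version(smi_header))   # 10.1
print(toolkit_cuda_version(nvcc_line))   # 10.0
```

When comparing against what TensorFlow expects, the toolkit number from nvcc is probably the relevant one; the driver number only needs to be at least as high.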

You can get everything step by step:

1. NVIDIA Linux driver: https://www.nvidia.com/Download/index.aspx?lang=en-us
2. CUDA: https://developer.nvidia.com/cuda-downloads
3. cuDNN: https://developer.nvidia.com/rdp/cudnn-download
4. Install: libcudnn7-dev, libcudnn7-doc, libcudnn7
5. Install: nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb

To download libcudnn and nvidia-machine-learning:

https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/

I'm using:

tensorflow (1.13.1)
tensorflow-gpu (1.13.1)
tf-nightly-gpu (1.14.1.dev20190509)
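
Which CUDA toolkit a prebuilt tensorflow-gpu wheel needs depends on the TensorFlow release. Below is a small lookup sketch. The table is filled in from TensorFlow's published tested build configurations for Linux as far as I recall them (1.5 and 1.12 built against CUDA 9.0, 1.13 and 1.14 against CUDA 10.0), so treat the values as assumptions to verify. `expected_cuda` is a hypothetical helper, not part of any library.

```python
# Assumed mapping from tensorflow-gpu minor release to the CUDA toolkit its
# official Linux wheel was built against; verify before relying on it.
ASSUMED_CUDA_FOR_TF = {
    (1, 5): "9.0",
    (1, 12): "9.0",
    (1, 13): "10.0",
    (1, 14): "10.0",
}


def expected_cuda(tf_version):
    """Look up the assumed CUDA release for a version string like '1.13.1'."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return ASSUMED_CUDA_FOR_TF.get((major, minor))


print(expected_cuda("1.13.1"))  # 10.0
print(expected_cuda("1.5.0"))   # 9.0
```

If the table is right, the 1.5.0 wheel mentioned in the comments targets CUDA 9.0, which could explain why downgrading did not help on a card that needs CUDA 10.0.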

In your code, start with the following at the top (with this I got the GPU working on an LSTM in TensorFlow):

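import tensorflow as tf  # TF 1.x API
import keras

# Allocate GPU memory on demand instead of reserving it all at startup.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
keras.backend.set_session(sess)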
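
If the crash persists, Python's standard faulthandler module can write the Python traceback at the moment of the segmentation fault, which helps locate the failing call. A minimal sketch, to be placed before the training code; the log file name and its location in the temp directory are assumptions chosen for illustration.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
trace_path = os.path.join(tempfile.gettempdir(), "segfault_trace.log")

# The file object must stay open for as long as the handler is enabled,
# so keep a module-level reference to it.
trace_file = open(trace_path, "w")

# On SIGSEGV (and similar fatal signals) dump the traceback of all threads.
faulthandler.enable(file=trace_file, all_threads=True)

print(faulthandler.is_enabled())  # True
```

Running the script with `python -X faulthandler script.py` enables the same handler without code changes and writes to stderr instead.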
