I am trying to build MXNet from source on CentOS 7.2, with the C++ language binding and GPU support. The build consists of two consecutive commands:
cmake -DUSE_CUDA=1 -DUSE_CUDNN=0 -DUSE_NCCL=0 -DUSE_OPENCV=0 -DUSE_MKLDNN=1 -DUSE_CPP_PACKAGE=1 -DCMAKE_BUILD_TYPE=Release -GNinja ..
ninja -v
The second command eventually fails with the following message:
FAILED: CMakeFiles/mxnet.dir/src/operator/nn/concat.cu.o
/fs/src/interfaces/terragen/CUDA/cuda-11/bin/nvcc -forward-unknown-to-host-compiler -DDMLC_CORE_USE_CMAKE -DDMLC_LOG_FATAL_THROW=1 -DDMLC_LOG_STACK_TRACE_SIZE=0 -DDMLC_MODERN_THREAD_LOCAL=0 -DDMLC_STRICT_CXX11 -DDMLC_USE_CXX11 -DDMLC_USE_CXX11=1 -DDMLC_USE_CXX14 -DMSHADOW_FORCE_STREAM -DMSHADOW_INT64_TENSOR_SIZE=0 -DMSHADOW_IN_CXX11 -DMSHADOW_USE_CBLAS=0 -DMSHADOW_USE_CUDA=1 -DMSHADOW_USE_CUDNN -DMSHADOW_USE_MKL=1 -DMSHADOW_USE_SSE -DMXNET_USE_BLAS_MKL=1 -DMXNET_USE_CUDA=1 -DMXNET_USE_LAPACK=1 -DMXNET_USE_LIBJPEG_TURBO=0 -DMXNET_USE_MKLDNN=1 -DMXNET_USE_NCCL=1 -DMXNET_USE_NVML=0 -DMXNET_USE_NVTX=1 -DMXNET_USE_OPENCV=0 -DMXNET_USE_OPENMP=1 -DMXNET_USE_OPERATOR_TUNING=1 -DMXNET_USE_SIGNAL_HANDLER=1 -DNDEBUG=1 -DUSE_CUDNN -D__USE_XOPEN2K8 -Dmxnet_EXPORTS -I../3rdparty/mkldnn/include -I3rdparty/mkldnn/include -I../include -I../src -I../3rdparty/tvm/nnvm/include -I../3rdparty/tvm/include -I../3rdparty/dmlc-core/include -I../3rdparty/dlpack/include -I../3rdparty/mshadow -I../3rdparty/mkldnn/src/../include -I3rdparty/dmlc-core/include -isystem=/depot/intel/parallel_studio_xe_2019_update5_cluster_edition/mkl/include -isystem=/fs/src/interfaces/terragen/CUDA/cuda-11/include -gencode arch=compute_30,code=sm_30 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -O3 -DNDEBUG -Xcompiler=-fPIC --expt-relaxed-constexpr -std=c++14 -MD -MT CMakeFiles/mxnet.dir/src/operator/nn/concat.cu.o -MF CMakeFiles/mxnet.dir/src/operator/nn/concat.cu.o.d -x cu -c ../src/operator/nn/concat.cu -o CMakeFiles/mxnet.dir/src/operator/nn/concat.cu.o
nvcc fatal : Unsupported gpu architecture 'compute_30'
This is the environment and its dependencies:
- CentOS 7.2
- gcc 8.2.0
- glibc - the version of gcc 9.1.0 (there is a reason for this mess)
- ld 2.23.52.0.1-55.el7 20130226
- cmake 3.17.2
- python 3.8.3
- ninja 1.10.0
- CUDA 11
- Nvidia GPU Tesla P100
Another piece of useful information is that after running the first command:
cmake -DUSE_CUDA=1 -DUSE_CUDNN=0 -DUSE_NCCL=0 -DUSE_OPENCV=0 -DUSE_MKLDNN=1 -DUSE_CPP_PACKAGE=1 -DCMAKE_BUILD_TYPE=Release -GNinja ..
We get (among other lines):
-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.0;5.0;5.2;6.0;6.1;7.0;7.5
-- CUDA: Using the following NVCC architecture flags -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
-- Found CUDAToolkit: /remote/vgeuclideproj1/TerraGen/cuda/cuda-11/include (found version "11.0.221")
-- Could NOT find NVML (missing: NVML_INCLUDE_DIRS)
CMake Warning at CMakeLists.txt:566 (message):
  Could not find NVML libraries
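None of this stops the configuration step, but it may help explain the failure: automatic GPU detection failed, so the build falls back to a list of common architectures that starts with 3.0, which matches the compute_30 that nvcc later rejects.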
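Edit (Solution): Apparently the CUDA 11 toolkit does not support every architecture in the fallback list: as the error shows, its nvcc rejects compute_30. The Tesla P100 itself is a compute capability 6.0 device, so the GPU was not the limiting factor. Reverting to CUDA 10.2 (along with all relevant dependencies) solved the problem. In addition, I have found that the method of architecture detection can be controlled through the MXNET_CUDA_ARCH variable in CMakeLists.txt. I hope this helps others who run into the same problem.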
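For anyone who would rather stay on CUDA 11, here is a minimal sketch of using MXNET_CUDA_ARCH to leave out the rejected architecture. I have not tested it on this setup. It assumes the variable accepts a semicolon-separated list of compute capabilities in the same format CMake printed; drop_arch is a hypothetical helper, not part of MXNet. The script only prints the configure command so it can be checked before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of MXNet): remove one entry from a
# semicolon-separated list of compute capabilities.
drop_arch() {
    printf '%s\n' "$1" | tr ';' '\n' | grep -vxF -- "$2" | paste -sd ';' -
}

# Fallback list that CMake reported, minus 3.0 (the compute_30 nvcc rejected).
ARCHS=$(drop_arch "3.0;5.0;5.2;6.0;6.1;7.0;7.5" "3.0")

# Print the configure command for review instead of running it directly.
# Assumption: MXNET_CUDA_ARCH takes an explicit list like this one.
echo "cmake -DMXNET_CUDA_ARCH=\"$ARCHS\" -DUSE_CUDA=1 -DUSE_CUDNN=0 -DUSE_NCCL=0 -DUSE_OPENCV=0 -DUSE_MKLDNN=1 -DUSE_CPP_PACKAGE=1 -DCMAKE_BUILD_TYPE=Release -GNinja .."
```

Since the only GPU here is a Tesla P100, passing just 6.0 instead of the filtered list should also work under the same assumption, and would shorten the build.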