On Debian 10, I have two RTX A6000 GPU cards connected by an NVLink bridge, and I would like to benefit from the potential combined power of both cards.

Currently, I have the following magma.make, which is invoked by a Makefile:

CXX = nvcc -std=c++17 -O3
LAPACK = /opt/intel/oneapi/mkl/latest
LAPACK_ANOTHER=/opt/intel/mkl/lib/intel64
MAGMA = /usr/local/magma
INCLUDE_CUDA=/usr/local/cuda/include
LIBCUDA=/usr/local/cuda/lib64

SEARCH_DIRS_INCL=-I${MAGMA}/include -I${INCLUDE_CUDA} -I${LAPACK}/include
SEARCH_DIRS_LINK=-L${LAPACK}/lib/intel64 -L${LAPACK_ANOTHER} -L${LIBCUDA} -L${MAGMA}/lib

CXXFLAGS = -c -DMAGMA_ILP64 -DMKL_ILP64 -m64 ${SEARCH_DIRS_INCL}

LDFLAGS = ${SEARCH_DIRS_LINK} -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lcuda -lcudart -lcublas -lmagma -lpthread -lm -ldl -Xnvlink

SOURCES = main_magma.cpp XSAF_C_magma.cpp
EXECUTABLE = main_magma.exe

As you can see, I have used the last flag -Xnvlink, but it generates the following error when building:

/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
make: *** [Makefile:10: main_magma.exe] Error 1

Which flag or option should I use so that the executable exploits the combined power of the 2 GPUs with NVLink?

Dharman
  • There is no relationship whatsoever between the nvlink hardware component and the -Xnvlink switch. The switch is not used to access the combined power of both GPUs. For that you need to use standard multi GPU programming methods which are covered in various tutorials available from nvidia GTC resources, and these have no connection to the usage of that switch. – Robert Crovella Nov 06 '21 at 00:56
  • @RobertCrovella OK, good to know. Do you by chance have a link to documentation on compiling in a way that takes the NVLink hardware component into account? –  Nov 06 '21 at 00:59
  • @youpilat13: There isn’t one. No compiler is going to magically do anything related to multi-GPU. You have to write the code for it explicitly or use a library that contains such code. There is no compiler driven solution to your multi-GPU Magma question(s) – talonmies Nov 06 '21 at 01:22
  • There are no compilation settings needed to enable NVLink. It will be used automatically for device-to-device transfers (e.g. `cudaMemcpyPeerAsync`) when you first enable peer connections between the two devices. – Robert Crovella Nov 06 '21 at 01:33

1 Answer

I have used the last flag -Xnvlink ...

Let's consult some documentation:

The following table lists some useful nvlink options which can be specified with nvcc option --nvlink-options.

4.2.9.2.1. --disable-warnings (-w)
Inhibit all warning messages.

4.2.9.2.2. --preserve-relocs (-preserve-relocs)
Preserve resolved relocations in linked executable.

4.2.9.2.3. --verbose (-v)
Enable verbose mode which prints code generation statistics.

4.2.9.2.4. --warning-as-error (-Werror)
Make all warnings into errors.

4.2.9.2.5. --suppress-arch-warning (-suppress-arch-warning)
Suppress the warning that otherwise is printed when object does not contain code for target arch.

4.2.9.2.6. --suppress-stack-size-warning (-suppress-stack-size-warning)
Suppress the warning that otherwise is printed when stack size cannot be determined.

4.2.9.2.7. --dump-callgraph (-dump-callgraph)
Dump information about the callgraph and register usage.

It should be obvious from that text that -Xnvlink passes options to nvlink, the device linker that nvcc runs during the build, and that none of this has anything to do with NVLink, which is a hardware interconnect technology.

Which flag or option should I use so that the executable exploits the combined power of the 2 GPUs with NVLink?

There is no flag or option. There is no compiler-assisted multi-GPU support. You have to write your own multi-GPU code, or use a library where someone wrote it for you. If such multi-GPU code is present in your executable, it will work without the need for any special compiler options or flags during compilation.
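
To make "write your own multi-GPU code" a little more concrete, here is a minimal sketch of the peer-to-peer transfer mentioned in the comments. It is only an illustration under stated assumptions: the two cards are taken to be devices 0 and 1, the helper name peer_copy_demo is hypothetical, and the synchronous cudaMemcpyPeer is used in place of cudaMemcpyPeerAsync to keep the example short. Nothing in it asks for NVLink explicitly; whether the copy actually travels over the bridge is decided by the driver once peer access is enabled.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copies n floats from GPU 0 to GPU 1 and verifies them.
// Returns 0 on success, 1 if skipped (fewer than two GPUs or no peer access),
// and -1 on a CUDA error or data mismatch.
int peer_copy_demo(size_t n) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 2) return 1;

    // Assumption: the two cards are devices 0 and 1.
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    if (!can01 || !can10) return 1;

    const size_t bytes = n * sizeof(float);
    std::vector<float> src(n, 1.5f), dst(n, 0.0f);
    float *d0 = nullptr, *d1 = nullptr;
    bool ok = true;
    auto check = [&ok](cudaError_t e) { if (e != cudaSuccess) ok = false; };
    // Enabling peer access twice is reported as an error; treat it as success.
    auto enable_peer = [&check](int peer) {
        cudaError_t e = cudaDeviceEnablePeerAccess(peer, 0);
        if (e == cudaErrorPeerAccessAlreadyEnabled) {
            cudaGetLastError();  // clear the non-fatal error state
            e = cudaSuccess;
        }
        check(e);
    };

    // Peer access is enabled per direction, from the current device.
    check(cudaSetDevice(0));
    enable_peer(1);
    check(cudaMalloc(&d0, bytes));
    check(cudaMemcpy(d0, src.data(), bytes, cudaMemcpyHostToDevice));

    check(cudaSetDevice(1));
    enable_peer(0);
    check(cudaMalloc(&d1, bytes));

    // Direct device-to-device copy: dst, dstDevice, src, srcDevice, size.
    check(cudaMemcpyPeer(d1, 1, d0, 0, bytes));
    check(cudaMemcpy(dst.data(), d1, bytes, cudaMemcpyDeviceToHost));

    // Verify on the host that the data arrived intact.
    for (size_t i = 0; ok && i < n; ++i) ok = (dst[i] == src[i]);

    cudaFree(d1);
    cudaSetDevice(0);
    cudaFree(d0);
    return ok ? 0 : -1;
}

int main() {
    const int rc = peer_copy_demo(1 << 20);
    if (rc == 0)      std::puts("peer copy verified");
    else if (rc == 1) std::puts("skipped: need two GPUs with peer access");
    else              std::puts("peer copy failed");
    return rc < 0 ? 1 : 0;
}
```

Assuming the file is saved as peer_copy.cu (a made-up name), it builds with a plain `nvcc -o peer_copy peer_copy.cu`, with no linker switch related to NVLink.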

talonmies